Thanks for all your input, I'll try to summarize here.
First of all, booting into console mode rather than running the full-blown desktop seemed to eliminate most of the problems, although it's still not quite as stable as I'd like.
Also, I don't quite understand how all of that could interfere with my RT thread.
I was going to try to install a more minimal system anyway, and I don't need a graphical environment for this, but during development it's kind of nice to have.
I would still like to see how far I can take this, and was really hoping I could continuously use 80-90% of all CPU cores without dropouts.
Is that realistic with a lowlatency kernel?
Do you lock all memory used by your RT threads?
If you don't, and the system is configured for high swappiness,
this sort of thing could happen.
I'm routinely running big real-time convolution matrices without
problems, so it's certainly possible.
<https://en.wikipedia.org/wiki/Swappiness>
I am not currently locking memory. I thought I had plenty of RAM, so as not to cause any swapping, but I guess it's good practice to wire memory, so I will give it a try.
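For what it's worth, both points can be checked from outside the program. The sketch below reads the current swappiness and the VmLck line of /proc/<pid>/status, which reports how much memory a process has locked. The function names are made up for illustration, and this only inspects state; the locking itself is normally done inside the RT program, for example with mlockall(MCL_CURRENT | MCL_FUTURE) in C.

```python
import re
from pathlib import Path


def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Return the current vm.swappiness value as an int."""
    return int(Path(path).read_text().strip())


def locked_kb(status_text):
    """Extract the VmLck value (in kB) from /proc/<pid>/status text.

    Returns None if the text has no VmLck line.
    """
    match = re.search(r"^VmLck:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def locked_kb_for_pid(pid):
    """Convenience wrapper: read /proc/<pid>/status and parse VmLck."""
    return locked_kb(Path(f"/proc/{pid}/status").read_text())
```

Calling locked_kb_for_pid() with the PID of the audio process should show a non-zero value once memory locking is in place, and 0 if nothing is locked.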
Bad kernel driver? WiFi drivers are known to be bad for things like this. An interrupt driver can block if it is designed badly. On one machine I found I had to unload the kernel module for my WiFi, as it actually created more problems when I turned the power off to the TX than when it was on. (It seems to me that with the WiFi turned on I got xruns every 5 seconds, but with it turned off it was every half second or so... which sounds very close to 0.6. Unloading the kernel module fixed it.)
Cron should also be turned off, but that is probably not the problem here. Cron runs super "nice", but some of the things it does, like package updates, can cause problems too. I turn off cron while recording.
I don't have wireless on my machine, nor an NVIDIA card, just Intel built-in graphics. This is where my Linux knowledge falls short, but if I don't have that hardware, can I assume no drivers for it are loaded?
AFAIK, the important things are.
1. Use a properly configured realtime patched kernel.
Is a lowlatency kernel not going to cut it?
I wasn't really able to find much info on the difference between the two, other than that the rt kernel is a step up, hard realtime vs. soft.
But nothing on how this is technically achieved.
2. Set a high priority of the soundcard interrupt, something like 97 is
a good value. (If using a USB soundcard, set the priority of the
interrupt servicing the USB hub instead).
3. Run Jack with realtime and memlocking enabled and at a priority of
I'm not running JACK but rather using ALSA directly.
4. Make sure that you don't have any hardware/drivers that play havoc
with your kernel scheduling. some WIFI adapters, NVIDIA, etc comes to
5. Make sure that the system isn't suffering from SMI/NMIs which
preempt the kernel and can take a long time to execute. This can be
done with hwlatdetect script in the rt-tests package.
6. Use cyclictest from rt-tests to confirm that there are no latency
spikes in how the kernel schedules threads.
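As a rough illustration of point 2: when interrupts are threaded (on an rt kernel, or with the threadirqs boot parameter), each handler shows up as a kernel thread named irq/<number>-<name>, and its realtime priority can be read back. The sketch below lists those threads; the function names are hypothetical, and it only reports priorities rather than changing them.

```python
import os
import re

# Kernel IRQ threads are named "irq/<number>-<name>"; the name may be
# truncated because /proc/<pid>/comm holds at most 15 characters.
IRQ_COMM = re.compile(r"^irq/(\d+)-(.*)$")


def parse_irq_comm(comm):
    """Split a thread name like 'irq/125-snd_hda' into (125, 'snd_hda')."""
    match = IRQ_COMM.match(comm.strip())
    return (int(match.group(1)), match.group(2)) if match else None


def list_irq_threads(proc="/proc"):
    """Yield (pid, irq, name, rt_priority) for each threaded IRQ handler."""
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "comm")) as handle:
                parsed = parse_irq_comm(handle.read())
            if parsed is None:
                continue
            pid = int(entry)
            priority = os.sched_getparam(pid).sched_priority
        except OSError:
            continue  # the process vanished or could not be read
        yield pid, parsed[0], parsed[1], priority
```

Printing the tuples from list_irq_threads() shows which PID belongs to the soundcard interrupt; its priority could then be raised with something like "chrt -f -p 97 <pid>" as root.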
Possibly hyperthreading, cpu power management, etc could cause
problems, and I don't have experience with all hardware out there, but
IME on modern Intel hardware this isn't a problem.
I did actually find that hyperthreading had an impact; turning it off made everything much more predictable.
JACK2 also has a very nice profiling tool that can give a good idea
about what is going on with the soundcard interrupt, clients, etc.
Keep an eye on the interrupts while it's all running, particularly
non-maskable interrupts. Try to correlate them with the 0.6 sec
interval of the glitches if possible:
watch -n 0.1 cat /proc/interrupts
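Spotting which counter moves by eye is hard at that refresh rate. A small sketch along the following lines could print only the counters that changed between two snapshots of /proc/interrupts. The function names and the default list of ignored counters are assumptions for illustration, not part of any existing tool.

```python
import time


def parse_interrupts(text):
    """Map each IRQ label to its count summed over all CPUs."""
    lines = text.strip().splitlines()
    cpu_count = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        counts = []
        for field in rest.split()[:cpu_count]:
            if not field.isdigit():
                break  # reached the description columns
            counts.append(int(field))
        totals[label.strip()] = sum(counts)
    return totals


def changed(before, after):
    """Return {label: delta} for counters that moved between snapshots."""
    return {
        label: after[label] - before[label]
        for label in after
        if label in before and after[label] != before[label]
    }


def watch(interval=0.1, ignore=("LOC", "RES")):
    """Print timestamped deltas forever, skipping very noisy counters."""
    with open("/proc/interrupts") as handle:
        previous = parse_interrupts(handle.read())
    while True:
        time.sleep(interval)
        with open("/proc/interrupts") as handle:
            current = parse_interrupts(handle.read())
        for label, delta in changed(previous, current).items():
            if label not in ignore:
                print(f"{time.monotonic():.3f} {label} +{delta}")
        previous = current
```

Running watch() next to the audio program gives a timestamped log that can be lined up against the times of the glitches.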
I've written up some of the checks I generally do, perhaps browse
that to see if there's anything there that you could check?
That's all I can think of at the moment. -Harry
Here's the output of cat /proc/interrupts:
         CPU0     CPU1     CPU2     CPU3
  0:       57        0        0        0  IO-APIC-edge timer
  1:        3        0        0        0  IO-APIC-edge i8042
  7:       44        0        0        0  IO-APIC-edge
  8:        1        0        0        0  IO-APIC-edge rtc0
  9:        3        0        0        0  IO-APIC-fasteoi acpi
 12:        4        0        0        0  IO-APIC-edge i8042
 16:        0        0        0        0  IO-APIC 16-fasteoi madifx
121:     7074        0        0      341  PCI-MSI-edge xhci_hcd
122:    13001    25946        0      342  PCI-MSI-edge 0000:00:17.0
123:     3409        0        0        0  PCI-MSI-edge eth0
124:   171029        0        0        0  PCI-MSI-edge i915_bpo
125:     4805        0        0        0  PCI-MSI-edge snd_hda_intel
NMI:       17       12       13       14  Non-maskable interrupts
LOC:   544121   436328   444080   462821  Local timer interrupts
SPU:        0        0        0        0  Spurious interrupts
PMI:       17       12       13       14  Performance monitoring interrupts
IWI:        0        0        0        0  IRQ work interrupts
RTR:        3        0        0        0  APIC ICR read retries
RES:    13051    11975    11216     8004  Rescheduling interrupts
CAL:      613      547      560      526  Function call interrupts
TLB:      640      767      676      535  TLB shootdowns
TRM:        0        0        0        0  Thermal event interrupts
THR:        0        0        0        0  Threshold APIC interrupts
MCE:        0        0        0        0  Machine check exceptions
MCP:       31       31       31       31  Machine check polls
HYP:        0        0        0        0  Hypervisor callback interrupts
The local timer interrupts are firing all the time, but I guess they should.
123 eth0 is also updated rather often. But the one that's closest to 0.6 s seems to be:
122:    13001    26147        0      342  PCI-MSI-edge 0000:00:17.0
But is there anything I can do about that?
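One thing that could be tried, as a sketch rather than a confirmed fix, is steering that interrupt away from the core running the RT thread by writing a CPU list to /proc/irq/<n>/smp_affinity_list. The helper names, the choice of IRQ 122, and the idea that the RT thread sits on CPU1 are all assumptions for illustration. Writing the file needs root, and some interrupts do not accept affinity changes.

```python
def cpu_list(cpus):
    """Format CPU numbers the way smp_affinity_list expects, e.g. '0,2,3'."""
    return ",".join(str(cpu) for cpu in sorted(set(cpus)))


def set_irq_affinity(irq, cpus):
    """Ask the kernel to deliver an IRQ only on the given CPUs.

    Needs root. Raises OSError if the kernel rejects the change.
    """
    with open(f"/proc/irq/{irq}/smp_affinity_list", "w") as handle:
        handle.write(cpu_list(cpus))
```

For example, set_irq_affinity(122, [0, 2, 3]) would request that IRQ 122 no longer be delivered on CPU1. Reading the same file back afterwards shows whether the kernel accepted it.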