> From: d@drobilla.net
Very interesting.
You might be running into some basic scheduler weirdness here though
and not something inherently wrong with the POSIX queues. I ran your
code here a few times in some different configurations. The results with 1M
messages had wild variance with SCHED_FIFO, sometimes 2s, 4s, 6s, etc.
Not reliable - although without rescheduling they did seem more consistent.
These below are with 10M to give longer run times:
a. no SCHED_FIFO, 10M cycles
[nicky@fidelispc] /tmp [65] cc ipc.c -lrt
[nicky@fidelispc] /tmp [66] ./a.out 4096 10000000
Sending a 4096 byte message 10000000 times.
Pipe recv time: 23.220948
Pipe send time: 23.220820
Queue recv time: 13.949289
Queue send time: 13.949226
b. SCHED_FIFO, again 10M cycles.
[nicky@fidelispc] /tmp [69] cc ipc.c -lrt -DSET_RT_SCHED=1
[nicky@fidelispc] /tmp [70] ./a.out 4096 10000000
Sending a 4096 byte message 10000000 times.
Pipe send time: 34.514288
Pipe recv time: 34.514404
Queue send time: 19.004525
Queue recv time: 19.004427
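To put the totals above on a per-message footing: 23.22 s over 10M messages is about 2.32 us per message for pipes and about 1.39 us for queues without SCHED_FIFO, against about 3.45 us and 1.90 us with it. A minimal sketch of that arithmetic follows; the function names are made up for illustration and are not part of ipc.c.

```c
#include <assert.h>

/* Mean cost of one message in microseconds, given the total run time
 * in seconds and the number of messages sent. */
double usec_per_message(double total_seconds, long messages)
{
    assert(messages > 0);
    return total_seconds * 1e6 / (double)messages;
}

/* Payload throughput in MiB/s for the same run. */
double mib_per_second(double total_seconds, long messages,
                      long bytes_per_message)
{
    assert(total_seconds > 0.0);
    double mib = (double)messages * (double)bytes_per_message
                 / (1024.0 * 1024.0);
    return mib / total_seconds;
}
```

For example, usec_per_message(34.514404, 10000000) gives the SCHED_FIFO pipe figure. These are averages over the whole run, so they say nothing about the variance seen between runs.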
This was on a dual core laptop, 2.2GHz, no speed stepping; I was also
watching top whilst running this.
Without FIFO the system CPU load spreads across both cores; they both run up
towards 100% load for both IPC methods.
With SCHED_FIFO/pipe the load does not distribute - I get 94% system load
on a single CPU whilst running through the loop.
The POSIX message queue code did not show this effect.
Odd results, so I had a look at vmstat rather than top; this gives some
indication:
No rescheduling:
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd    free   buff  cache   si   so    bi    bo   in     cs us sy id wa
 1  0      0 6506924 312036 595456    0    0     0   196  750   1569  5 94  0  0
 1  0      0 6489188 312036 613344    0    0     0     0  961  25508  8 88  3  0
 1  0      0 6488724 312036 613536    0    0     0     0  991  21070  6 92  2  0
 1  0      0 6488812 312036 613552    0    0     0     0  697   1446  5 94  1  0
SCHED_FIFO pipe():
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd    free   buff  cache   si   so    bi    bo   in     cs us sy id wa
 2  0      0 6516912 311372 586924    0    0     0   108  569 435180  5 46 44  4
 2  0      0 6516272 311372 586972    0    0     0     0  556 436042  3 47 50  0
 1  0      0 6516608 311372 586972    0    0     0     0  548 436482  6 46 48  0
 1  0      0 6516928 311372 586924    0    0     0     0  563 435930  2 51 48  0
Ouch. Almost 100 times the number of context switches. Is the whole kernel
bound up in a single thread doing process context switches?
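The cs column in vmstat is system-wide, so it cannot say whether the sender or the receiver is the one being switched out. One way to narrow that down, assuming the benchmark can be patched, is to read the per-process counters that getrusage() reports (ru_nvcsw for voluntary switches, ru_nivcsw for involuntary ones) before and after the loop. A minimal sketch; the struct and function names are hypothetical and not taken from ipc.c.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Context-switch counters for the calling process. */
struct ctx_switches {
    long voluntary;   /* process blocked or yielded (ru_nvcsw) */
    long involuntary; /* process was preempted (ru_nivcsw) */
};

/* Fill *out with the current counters. Returns 0 on success, -1 on error. */
int read_ctx_switches(struct ctx_switches *out)
{
    struct rusage ru;

    assert(out != NULL);
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    out->voluntary = ru.ru_nvcsw;
    out->involuntary = ru.ru_nivcsw;
    return 0;
}

/* Switches that happened between two snapshots. */
struct ctx_switches ctx_switches_delta(struct ctx_switches before,
                                       struct ctx_switches after)
{
    struct ctx_switches d;

    d.voluntary = after.voluntary - before.voluntary;
    d.involuntary = after.involuntary - before.involuntary;
    return d;
}
```

Taking a snapshot in both the sender and the receiver around their loops and printing the deltas next to the timings would show which side blocks most. A high voluntary count would point at blocking on a full or empty pipe or queue rather than at preemption.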
SCHED_FIFO message queues - generally lower, far more variance:
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd    free   buff  cache   si   so    bi    bo   in     cs us sy id wa
 1  0      0 6510328 316116 587048    0    0     0     0  427 142445  2 83 15  0
 1  1      0 6509736 316120 587044    0    0     0    44  440    795  3 97  1  0
 1  0      0 6509900 316132 587052    0    0     0    64  439 281037  9 69 22  0
 1  0      0 6509436 316132 587048    0    0     0     0  437    796  3 95  2  0
 1  0      0 6509868 316160 587020    0    0     0   164  452 151290  5 81 15  0
Vanilla kernel Linux fidelispc 2.6.32-35-generic #78-Ubuntu SMP Tue
Oct 11 16:11:24 UTC 2011 x86_64 GNU/Linux
Also tested with unbalanced priorities in the sender and receiver, and with only
prioritising one of them, pretty much the same as 90/90.
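For anyone wanting to repeat the unbalanced-priority test, here is a rough sketch of giving a process its own SCHED_FIFO priority. What ipc.c actually does under -DSET_RT_SCHED is not shown in this thread, so this assumes a plain sched_setscheduler() call; the function names are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

/* Clamp a requested priority into the range allowed for SCHED_FIFO. */
int clamp_fifo_priority(int requested)
{
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    assert(lo <= hi);
    if (requested < lo)
        return lo;
    if (requested > hi)
        return hi;
    return requested;
}

/* Try to move the calling process to SCHED_FIFO at the given priority.
 * Returns 0 on success, or the errno value on failure (typically EPERM
 * when run without root or a suitable rtprio limit). */
int set_fifo_priority(int requested)
{
    struct sched_param param;

    memset(&param, 0, sizeof(param));
    param.sched_priority = clamp_fifo_priority(requested);
    if (sched_setscheduler(0, SCHED_FIFO, &param) != 0) {
        int err = errno;
        fprintf(stderr, "sched_setscheduler: %s\n", strerror(err));
        return err;
    }
    return 0;
}
```

Calling set_fifo_priority(90) in the sender and, say, set_fifo_priority(80) in the receiver after the fork would give an unbalanced pair; calling it in only one of them covers the other case.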
Not sure if that helps any. I have another system with a single core, might try it
out there later since my results were very different from yours.
Regards, nick.
"we have to make sure the old choice [Windows] doesn't disappear".
Jim Wong, president of IT products, Acer
> From: d@drobilla.net
> To: linux-audio-dev@lists.linuxaudio.org
> Date: Thu, 24 Nov 2011 19:10:26 -0500
> Subject: [LAD] Pipes vs. Message Queues
>
> I got curious, so I bashed out a quick program to benchmark pipes vs
> POSIX message queues. It just pumps a bunch of messages through the
> pipe/queue in a tight loop. The results were interesting:
>
> $ ./ipc 4096 1000000
> Sending a 4096 byte message 1000000 times.
> Pipe recv time: 6.881104
> Pipe send time: 6.880998
> Queue send time: 1.938512
> Queue recv time: 1.938581
>
> Whoah. Which made me wonder what happens with realtime priority
> (SCHED_FIFO priority 90):
>
> $ ./ipc 4096 1000000
> Sending a 4096 byte message 1000000 times.
> Pipe send time: 5.195232
> Pipe recv time: 5.195475
> Queue send time: 5.224862
> Queue recv time: 5.224987
>
> Pipes get a bit faster, and POSIX message queues get dramatically
> slower. Interesting.
>
> I am opening the queues as blocking here, and both sender and receiver
> are at the same priority, and aggressively pumping the queue as fast as
> they can, so there is a lot of competition and this is not an especially
> good model of any reality we care about, but it's interesting
> nonetheless.
>
> The first result really has me thinking how much Jack would benefit from
> using message queues instead of pipes and sockets. It looks like
> there's definitely potential here... I might try to write a more
> scientific benchmark that better emulates the case Jack would care about
> and measures wakeup latency, unless somebody beats me to it. That test
> could have the shm + wakeup pattern Jack actually uses and benchmark it
> vs. actually firing buffer payload over message queues...
>
> But I should be doing more pragmatic things, so here's this for now :)
>
> Program is here: http://drobilla.net/files/ipc.c
>
> Cheers,
>
> -dr