Joined: Sep 2006
Posts: 28,201
Legend
Quote:

So I've been researching wait-free algorithms such as wait-free stacks, where N threads can push objects onto a LIFO structure without locks and still be thread-safe.

Nowadays, with processors like the Core i7 having 8 or more logical cores, not knowing how to distribute your tasks is going to leave you in the dust performance-wise. So I'm learning it, and I've got some personal code running on it with good results so far. You can still deadlock or livelock in .NET, but it does help catch many of the other gotchas you may run into, like editing Windows controls from multiple thread contexts.




Can you share some resources that you've found that were helpful?
I've done some threading in a few utilities I've written, but I've mostly relied on locks to this point to help control access to objects.


Browns is the Browns

... there goes Joe Thomas, the best there ever was in this game.

Joined: Nov 2006
Posts: 3,259
Hall of Famer
Atomic operations are the fundamental building block of it (in the Windows API they're called InterlockedIncrement, InterlockedDecrement, etc.)

http://msdn.microsoft.com/en-us/library/ms686360(v=VS.85).aspx

A larger construct the API provides is the Interlocked Singly Linked List (really a stack)

http://msdn.microsoft.com/en-us/library/ms686962(v=VS.85).aspx

Equivalents exist in .NET as well (the System.Threading.Interlocked class and ConcurrentStack<T>).
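
For anyone who wants to try the idea off Windows, here is a minimal sketch of a lock-free stack in portable C++ using std::atomic instead of the Interlocked functions. It is not the WinAPI implementation. The names PushOnlyStack, Node and push_from_threads are made up for illustration. It supports only push plus "detach everything" (similar in spirit to InterlockedFlushSList), which sidesteps the ABA problem that a single-node pop would have to deal with.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical node type for this sketch (not a WinAPI structure).
struct Node {
    int value;
    Node* next;
};

class PushOnlyStack {
public:
    // Lock-free push: keep retrying the compare-exchange until our node
    // is installed as the new head.
    void push(int value) {
        Node* node = new Node{value, head_.load(std::memory_order_relaxed)};
        while (!head_.compare_exchange_weak(node->next, node,
                                            std::memory_order_release,
                                            std::memory_order_relaxed)) {
            // On failure, node->next has been refreshed with the current head.
        }
    }

    // Detach the whole list in one atomic exchange. The caller owns the nodes.
    Node* pop_all() {
        return head_.exchange(nullptr, std::memory_order_acquire);
    }

private:
    std::atomic<Node*> head_{nullptr};
};

// Push from several threads, then drain the stack and count the nodes.
std::size_t push_from_threads(int thread_count, int pushes_per_thread) {
    assert(thread_count > 0 && pushes_per_thread >= 0);
    PushOnlyStack stack;
    std::vector<std::thread> workers;
    for (int t = 0; t < thread_count; ++t) {
        workers.emplace_back([&stack, pushes_per_thread] {
            for (int i = 0; i < pushes_per_thread; ++i) stack.push(i);
        });
    }
    for (auto& w : workers) w.join();

    std::size_t count = 0;
    Node* node = stack.pop_all();
    while (node != nullptr) {
        Node* next = node->next;
        delete node;
        node = next;
        ++count;
    }
    return count;
}
```

Calling push_from_threads(8, 1000) should drain exactly 8000 nodes, since every push adds one node and nothing is removed until all threads have joined.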


#gmstrong
Joined: Sep 2006
Posts: 28,201
Legend
Groovy, thanks


Browns is the Browns

... there goes Joe Thomas, the best there ever was in this game.

Joined: Nov 2006
Posts: 3,259
Hall of Famer
no problem!

As an example, I wrote a thread job that iterates over the elements of an array. Each of the 8 threads runs the same function, like so, sharing one counter:

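// Shared by all worker threads.
volatile LONG m_Counter = 0;

// Body of the function that every thread runs.
while( true )
{
    // InterlockedIncrement returns the incremented value, so each thread gets a unique 1-based index.
    LONG index = InterlockedIncrement( &m_Counter );

    if( index > arraySize )
        break;

    operateOnArrayElement( mArray[index-1] );
}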

My sample data took 3ms to complete in single-threaded mode; distributing it across 8 threads dropped that to 1ms. It's one of those things where, as the data gets heavier, your overall speed improvement gets better and better! It's basically a C++ version of the .NET Parallel.For method.
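
The same pattern can be sketched in portable C++ with std::atomic and std::thread. This is only an illustration, not the code from the post: parallel_for_each and its parameters are hypothetical names. One difference to watch: fetch_add returns the value before the increment, so the index is already 0-based and no "index-1" is needed.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Apply `op` to every element of `items`. Each of the `thread_count` threads
// claims the next unprocessed index from one shared atomic counter.
template <typename T, typename Op>
void parallel_for_each(std::vector<T>& items, unsigned thread_count, Op op) {
    assert(thread_count > 0);
    std::atomic<std::size_t> next{0};

    auto worker = [&] {
        while (true) {
            // fetch_add returns the previous value, so indices start at 0.
            std::size_t index = next.fetch_add(1, std::memory_order_relaxed);
            if (index >= items.size()) break;
            op(items[index]);
        }
    };

    std::vector<std::thread> threads;
    for (unsigned t = 0; t < thread_count; ++t) threads.emplace_back(worker);
    for (auto& th : threads) th.join();  // join also publishes the results
}
```

Usage would look like parallel_for_each(data, 8, [](int& x) { x *= 2; }); with data being a std::vector<int>. Each element is touched by exactly one thread, so op needs no locking of its own as long as it only modifies the element it is given.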


#gmstrong
Joined: Sep 2006
Posts: 28,201
Legend
Very cool... and because InterlockedIncrement() is handling the syncing for you, you don't have to sweat the details about two threads perhaps getting the same index? Very nice.

Of course, it seems that you could just write a quick-n-dirty lock function that does the same thing: block other threads by taking out a lock on the counter, increment the counter, unlock, and return the value. I would think that the performance of the two would be nearly identical.
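
The two approaches can be sketched side by side in portable C++. The class names LockedCounter and AtomicCounter and the helper claim_all are invented for this illustration, and std::mutex stands in for whatever lock you would actually use. The sketch only shows that both hand out each index exactly once; it says nothing about which is faster, which has to be measured.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Lock-based counter: take the lock, bump the value, release, return.
class LockedCounter {
public:
    std::size_t next() {
        std::lock_guard<std::mutex> guard(mutex_);
        return value_++;
    }
private:
    std::mutex mutex_;
    std::size_t value_ = 0;
};

// Atomic counter: one atomic read-modify-write, no lock object.
class AtomicCounter {
public:
    std::size_t next() { return value_.fetch_add(1, std::memory_order_relaxed); }
private:
    std::atomic<std::size_t> value_{0};
};

// Let `thread_count` threads claim indices below `limit` and record how many
// times each index was handed out. Every entry should end up as exactly 1.
template <typename Counter>
std::vector<int> claim_all(std::size_t limit, unsigned thread_count) {
    assert(thread_count > 0);
    Counter counter;
    std::vector<int> hits(limit, 0);
    std::vector<std::thread> threads;
    for (unsigned t = 0; t < thread_count; ++t) {
        threads.emplace_back([&] {
            for (std::size_t i = counter.next(); i < limit; i = counter.next()) {
                ++hits[i];  // race-free only because no two threads share an index
            }
        });
    }
    for (auto& th : threads) th.join();
    return hits;
}
```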


Browns is the Browns

... there goes Joe Thomas, the best there ever was in this game.

Joined: Sep 2006
Posts: 4,480
Hall of Famer
Here is a pretty good article comparing normal sequential processing vs. Parallel.For and CUDA GPU, if you are interested:
http://www.c-sharpcorner.com/UploadFile/rafaelwo/4398/

I recoded the C portions using a CUDA .NET class I found, but the results are similar.


#gmstrong
Joined: Sep 2006
Posts: 28,201
Legend
That's something I'd like to play with, but I just need a project to play with that holds my interest to give me a goal to work on.

It strikes me that CUDA is best used with massive parallelism and not so much something that only has a few things going on.


Browns is the Browns

... there goes Joe Thomas, the best there ever was in this game.

Joined: Nov 2006
Posts: 3,259
Hall of Famer
To Prp: you can do the same thing with critical sections, of course, but when you call the interlocked functions, they compile down to a few assembly instructions:

http://www.codemaestro.com/reviews/8

The interlocked functions are basically compiler intrinsics that map to the atomic instructions of whatever you're targeting: x86, x64, or Itanium (for Windows at least). A critical section is a much heavier object: even uncontended it is an enter/leave pair around your increment, and under contention it falls back to a kernel wait. You can do the same thing with critical sections, but you're going to cost yourself many more cycles to do it, potentially an order of magnitude or more when threads contend. For a game to run at 60fps you need to complete your game loop in about 16ms. That's not a lot of time, so any time you can save cycles like that is a big win.

And yeah, while a CPU-based thread system these days is what, 12 threads max? With GPGPU code you need to scale to at least 1024 threads or so; otherwise you're wasting a ton of pipelines running your work. That's why you don't see the throughput gains until you can get all the pipelines running at once.
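
If you want to measure the gap on your own machine rather than take anyone's word for it, a rough timing harness could look like the sketch below. Assumptions: std::mutex stands in for a Windows critical section, std::atomic stands in for InterlockedIncrement, and Timing, time_ms, time_atomic and time_mutex are hypothetical names. The numbers will vary a lot with hardware, compiler flags and thread count, so treat it as a starting point, not a benchmark.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

struct Timing {
    long long total;      // final counter value
    double milliseconds;  // wall-clock time for all threads together
};

// Call `increment` `per_thread` times on each of `thread_count` threads
// and return the elapsed wall-clock time in milliseconds.
template <typename Increment>
double time_ms(unsigned thread_count, int per_thread, Increment increment) {
    assert(thread_count > 0);
    auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> threads;
    for (unsigned t = 0; t < thread_count; ++t) {
        threads.emplace_back([&] {
            for (int i = 0; i < per_thread; ++i) increment();
        });
    }
    for (auto& th : threads) th.join();
    std::chrono::duration<double, std::milli> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// Shared counter bumped with an atomic read-modify-write.
Timing time_atomic(unsigned thread_count, int per_thread) {
    std::atomic<long long> counter{0};
    double ms = time_ms(thread_count, per_thread,
                        [&] { counter.fetch_add(1, std::memory_order_relaxed); });
    return {counter.load(), ms};
}

// Shared counter bumped while holding a lock.
Timing time_mutex(unsigned thread_count, int per_thread) {
    std::mutex m;
    long long counter = 0;
    double ms = time_ms(thread_count, per_thread, [&] {
        std::lock_guard<std::mutex> guard(m);
        ++counter;
    });
    return {counter, ms};
}
```

Both functions should report the same total (thread_count times per_thread); comparing the milliseconds fields for, say, 8 threads and a few million increments each gives a feel for the difference under contention.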


#gmstrong
Joined: Sep 2006
Posts: 30,826
Legend
j/c

There seems to be an issue with the board. It only happens in this thread, but for some reason I can barely make out any English here. The rest seems to be some foreign language I've never heard of.....

Joined: Sep 2006
Posts: 15,015
Legend


We don't have to agree with each other, to respect each others opinion.
Joined: Nov 2006
Posts: 3,259
Hall of Famer
It is a foreign language: C++


#gmstrong
Joined: Sep 2006
Posts: 14,248
Legend
If witty response then
witty response
else
ignore
end if

DawgTalkers.net Forums DawgTalk Tailgate Forum Need Help - I Wanna Learn Programming
