Friday, October 30, 2009

OSSP mm - Shared Memory Allocation

i played with the mm shared memory lib for a few days. i must say i liked it!
it is easy to understand and easy to integrate into existing code.
of course the lack of new()/delete() is somewhat limiting, but not that difficult to overcome in most cases.
alternatively, on our project we were looking into boost::interprocess, but i was kinda reluctant to use it, mostly because of the syntax (btw i found a boost/ace alternative in POCO; its users and designers say it provides boost/ace-like functionality, even if it cannot compare in performance, but the biggest gain is the C#-oriented syntax; probably worth examining).

there was one really weird problem with mm though, where i got the impression that mm_lock() wasn't working.
the workflow was as follows:
we have a hash located in the shared memory segment, and before i get() from it i mm_lock() it and subsequently mm_unlock() it. still, i noticed that many times the same value was being written to the hash more than once, which the design was supposed to rule out.
i fired up our test to create a few worker processes and set a breakpoint in each one before a get(). the locks seemed to work, but after a put() the lock suddenly seemed to vanish in the haze?! digging deeper i found out that in put() we of course try to allocate some new memory in the shared mem segment, and tracing down into mm_malloc i found out it was trying to lock too, and when done - calling mm_unlock()! so the inner unlock released the lock i was still holding around the hash. piece of bad design or weird side-effect?
either way, apart from that my impressions of OSSP mm were great. however, it turned out that the boost guys have a few more tricks up their sleeves; especially the named, persistent storage seemed a very attractive idea for our purposes, and we eventually decided to use the boost approach.
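to make the nested locking problem above easier to see, here is a minimal sketch. it does not use the real mm API: ToyLock, toy_malloc, put_naive and put_prealloc are hypothetical names, and the lock is just a flag that models "unlock always releases, no matter who locked". it also shows one possible workaround, assuming the allocation can be done before the critical section starts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Toy stand-in for a non-recursive lock. NOT the mm implementation:
// it only models a lock whose unlock() releases it unconditionally.
struct ToyLock {
    bool held = false;
    void lock()   { held = true; }
    void unlock() { held = false; }
};

// Hypothetical allocator that takes and releases the same lock internally,
// like the behaviour observed when tracing into the allocator.
void* toy_malloc(ToyLock& lk, std::size_t n) {
    lk.lock();
    void* p = std::malloc(n);
    lk.unlock();  // also drops a lock the caller may already hold
    return p;
}

// Naive put(): lock first, allocate inside the critical section.
// Returns whether the lock is still held where the hash would be modified.
bool put_naive(ToyLock& lk) {
    lk.lock();
    void* node = toy_malloc(lk, 32);
    bool still_held = lk.held;  // the allocator's unlock already released it
    std::free(node);
    lk.unlock();
    return still_held;
}

// Workaround sketch: allocate before entering the critical section,
// so nothing inside it touches the lock.
bool put_prealloc(ToyLock& lk) {
    void* node = toy_malloc(lk, 32);
    lk.lock();
    bool still_held = lk.held;  // nothing has unlocked since lock()
    std::free(node);
    lk.unlock();
    return still_held;
}
```

in this toy model put_naive() reports that the lock is gone by the time the hash would be touched, while put_prealloc() still holds it. whether the same reordering is practical with a real shared hash depends on how put() is structured.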
so...moving on and up again :)

Monday, October 19, 2009

random performance thoughts

just wanted to lay things out so i can see them more clearly. if it doesn't make sense to you, dear readers, please excuse me :)

so, we have many hashes (maps, containing cached data), but due to our implementation with fork() all the data is lost once a particular task is done. i had the idea of storing all those in a berkeley db so that every new forked task gets it from there and doesn't have to fill the hash by itself, but then - what is a task gonna read from the db when it doesn't know beforehand what it is gonna need? we cannot read everything as the data is HUGE, and we can't read bit by bit as this slows down performance a lot - we had a test case like this: we tried using memcached as a remote cache, but this slowed us down even when we read from it only once into a local hash and all subsequent calls were local (quick); if we hadn't read into the local hash first, the slowdown would have been even more devastating.

one possible solution is for the main process to reload everything from the stored data, so that all children get it via fork/copy-on-write; the children then put any changes/additions back into the db for the parent to read when a new task request starts.
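a minimal sketch of the fork/copy-on-write part of that idea, assuming a POSIX system. the Cache alias, the child_sees_key helper and the keys are made up for illustration; writing changes back to the db is left out.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

using Cache = std::unordered_map<std::string, std::string>;

// Forks a worker that looks up `key` in the cache it inherited from the
// parent. Returns true if the child found the key without loading anything.
bool child_sees_key(Cache& cache, const std::string& key) {
    pid_t pid = fork();
    if (pid < 0) return false;  // fork failed

    if (pid == 0) {
        // Child: the parent's cache is already in our address space.
        bool found = cache.count(key) > 0;
        // Writes land in the child's private copy (copy-on-write),
        // so the parent never sees this entry.
        cache["child-only"] = "x";
        _exit(found ? 0 : 1);
    }

    // Parent: wait for the child and read its verdict from the exit code.
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return false;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

the child gets the parent's data for free, but anything it adds stays private to it, which is why the changes would still have to travel back through the db (or some other channel) for the parent to pick them up.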

i have another one - what if we realise what we need from the cache much earlier than when we actually need it? we could then start a thread to pre-load the expected data while the main thread goes on processing, and when it reaches the point where the hash is needed - voila! it's all there! in case there are gaps they can be filled at runtime, which should be a small percentage of the overall effort.
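a rough sketch of the pre-load idea using std::async (so it assumes a compiler with C++11 thread support, and linking with the platform's thread library where needed). slow_fetch is a hypothetical stand-in for the db or remote cache, and the other names are made up as well.

```cpp
#include <cassert>
#include <future>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Cache = std::unordered_map<std::string, std::string>;

// Hypothetical slow lookup standing in for the db / remote cache.
std::string slow_fetch(const std::string& key) {
    return "value-of-" + key;
}

// Loads all expected keys into a fresh local hash.
Cache preload(std::vector<std::string> keys) {
    Cache cache;
    for (const auto& k : keys) cache[k] = slow_fetch(k);
    return cache;
}

// Starts the pre-load on a separate thread and returns immediately,
// so the caller can keep processing until the hash is needed.
std::future<Cache> start_preload(std::vector<std::string> expected_keys) {
    return std::async(std::launch::async, preload, std::move(expected_keys));
}

// Lookup that fills gaps at runtime when the prediction missed a key.
const std::string& get_or_fetch(Cache& cache, const std::string& key) {
    auto it = cache.find(key);
    if (it == cache.end()) it = cache.emplace(key, slow_fetch(key)).first;
    return it->second;
}
```

the main thread would call start_preload() as soon as it knows the expected keys, carry on with its own work, and call get() on the future at the point where the hash is needed. how much this buys depends on how early and how accurately the keys can be predicted.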

i'm working on a shared memory solution, but all of this got me doubting the performance hit, which could be similar to hitting memcached every time (by the way, i found out about the OSSP mm shared memory allocation project at http://www.ossp.org/pkg/lib/mm/, and it looks really good. i don't know how i missed it earlier, guess i was looking for a std-kind-of allocator and got put off by some messages that boost had given up on the idea. another btw - is the new std with C++0x gonna include something like this? from what i read it does include some fancy new stuff related to multithreaded and multicore programming, which is awesome. would be great once we get our hands on it)

so where was i....aaah...back to the drawing board, time for another brainstorming session...tomorrow :D