i played with the mm shared memory lib for a few days. i must say i liked it!
it is easy to understand and easy to implement into existing code.
of course the lack of new()/delete() is somewhat limiting, but not that difficult to overcome in most cases.
alternatively, on our project we were looking into boost::interprocess, but i was kinda reluctant to use it, mostly because of the syntax (btw i found a boost/ace alternative in POCO; its users and designers say it provides boost/ace-like functionality, and even if it cannot compare in performance, the biggest gain is the C#-oriented syntax; probably worth examining).
there was one really weird problem with mm though, where i got the impression that mm_lock() wasn't working.
the workflow was as follows:
we have a hash located in the shared memory segment, and before i get() from it i mm_lock() it and subsequently mm_unlock() it. still, i noticed that many times the same value was being written to the hash, which by design should not happen.
i fired up our test to create a few worker processes and set a breakpoint in each one before a get(). the locks seemed to work, but after a put() the lock suddenly seemed to vanish into the haze?! digging deeper i found out that in put() we of course try to allocate some new memory in the shared mem segment, and tracing down into mm_malloc i found that it was trying to lock too and, when done, calling mm_unlock()! so the inner unlock apparently released the lock i was still holding. piece of bad design or weird side effect? either way, apart from that my impressions of OSSP mm were great. however, it turned out that the boost guys have a few more tricks up their sleeves; especially the named, persistent storage seemed a very attractive idea for our purposes, and we eventually decided to use the boost approach.
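to make the mm_lock() surprise a bit more concrete, here is a tiny model of it. this is only a sketch: FlatLock, DepthLock and the alloc_* helpers are names i made up for illustration, nothing here is the real OSSP mm code, and the depth counter is per-process only. it just shows how an allocator that locks and unlocks internally can drop a flat lock the caller still thinks it holds, and how a lock that counts nesting would behave instead.

```cpp
#include <cassert>

// stand-in for a non-recursive lock: one flag, no nesting count.
// NOT the real mm implementation, just a model of what i observed.
struct FlatLock {
    bool held = false;
    void lock()   { held = true; }
    void unlock() { held = false; }
};

// hypothetical allocator that takes and drops the same lock internally,
// the way mm_malloc appeared to do when i traced into it
inline void alloc_with_internal_locking(FlatLock& l) {
    l.lock();
    // ... carve memory out of the shared segment here ...
    l.unlock();
}

// a nesting-aware wrapper: only the outermost unlock() releases the lock
struct DepthLock {
    FlatLock inner;
    int depth = 0;
    void lock()   { if (depth++ == 0) inner.lock(); }
    void unlock() { assert(depth > 0); if (--depth == 0) inner.unlock(); }
    bool held() const { return inner.held; }
};

// same hypothetical allocator, but going through the nesting-aware lock
inline void alloc_with_depth_lock(DepthLock& l) {
    l.lock();
    l.unlock();
}

// true if the caller's lock is still held right after a nested allocation
inline bool flat_lock_survives_put() {
    FlatLock l;
    l.lock();                        // caller: lock before touching the hash
    alloc_with_internal_locking(l);  // put() allocates, allocator unlocks
    bool still_held = l.held;
    l.unlock();
    return still_held;
}

inline bool depth_lock_survives_put() {
    DepthLock l;
    l.lock();
    alloc_with_depth_lock(l);
    bool still_held = l.held();
    l.unlock();
    return still_held;
}
```

with the flat lock the caller's lock is gone after the nested allocation, with the depth-counted one it survives. this is not a patch for mm, just a way to see why the second writer could sneak in.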
so...moving on and up again :)
Friday, October 30, 2009
Monday, October 19, 2009
random performance thoughts
just wanted to lay things out so i can see them more clearly. if it doesn't make sense to you, dear readers, please excuse me :)
so, we have many hashes (maps containing cached data), but due to our implementation with fork() all the data is lost once a particular task is done. i had the idea of storing all of those in a berkeley db, so that every newly forked task gets them from there and doesn't have to fill the hash by itself. but then - what is a task gonna read from the db when it doesn't know beforehand what it is gonna need? we cannot read everything, as the data is HUGE, and we can't read bit by bit, as this slows down performance a lot. we had a test case like this: we tried using memcached as a remote cache, but this slowed us down even when we read from it only once into a local hash and all subsequent calls were local (quick); if we hadn't read into a local hash first, the slowdown would have been even more devastating.
one possible solution is for the main process to reread everything from the stored data, so that all children get it via fork/copy-on-write and then put any changes/additions back into the db for the parent to read when a new task request starts.
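the fork/copy-on-write part of that idea can be sketched like this. it is a minimal sketch assuming a POSIX system; Cache, load_cache_in_parent() and child_sees_key() are hypothetical names, and the loader just returns two hard-coded entries where the real thing would read from the stored data. writing changes back to the db is left out.

```cpp
#include <string>
#include <unordered_map>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

using Cache = std::unordered_map<std::string, std::string>;

// hypothetical loader: the parent fills the cache once, before any fork()
inline Cache load_cache_in_parent() {
    return Cache{{"alpha", "1"}, {"beta", "2"}};
}

// forks a child that looks up `key` in the cache it inherited from the parent.
// the child never reloads anything; it reads the parent's pages, which the
// kernel shares copy-on-write. returns true if the child found the key.
inline bool child_sees_key(const Cache& cache, const std::string& key) {
    pid_t pid = fork();
    if (pid < 0) return false;          // fork failed
    if (pid == 0) {
        bool found = cache.find(key) != cache.end();
        _exit(found ? 0 : 1);           // report the result via the exit code
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return false;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

the catch stays the same as in the text: whatever a child adds to its copy is private to that child, so it still has to go back through the db (or some other channel) for the parent to pick up.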
i have another one around - what if we realise what we need from the cache much earlier than when we need it? we could then start a thread to pre-load the expected data while the main thread goes on processing, and when it reaches the point where the hash is needed - voila! it's all there! in case there are gaps, they can be filled at runtime, which should be a small percentage of the overall effort.
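a rough sketch of that pre-load idea, assuming a C++11 compiler with std::async available. all the names here (slow_fetch, preload, start_preload, get_or_fetch) are made up for the example, and slow_fetch() just fabricates a string where the real code would hit the db or the remote cache.

```cpp
#include <future>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Cache = std::unordered_map<std::string, std::string>;

// hypothetical slow lookup, stands in for the db / remote cache
inline std::string slow_fetch(const std::string& key) {
    return "value-of-" + key;
}

// runs on the background thread: fetch everything we expect to need
inline Cache preload(std::vector<std::string> keys) {
    Cache c;
    for (const auto& k : keys) c.emplace(k, slow_fetch(k));
    return c;
}

// kick off the pre-load as soon as we know the expected keys;
// the caller keeps processing and calls .get() when the hash is needed
inline std::future<Cache> start_preload(std::vector<std::string> expected_keys) {
    return std::async(std::launch::async, preload, std::move(expected_keys));
}

// lookup that fills gaps at runtime if the prediction missed a key
inline const std::string& get_or_fetch(Cache& c, const std::string& key) {
    auto it = c.find(key);
    if (it == c.end()) it = c.emplace(key, slow_fetch(key)).first;
    return it->second;
}
```

the main thread would call start_preload() early, go on with its work, and only block on the future's get() at the point where the hash is actually needed. whether this pays off depends on how well the needed keys can be predicted.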
i'm working on a shared memory solution, but all of this made me doubt the performance hit, which might be similar to hitting memcached every time (by the way, i found out about the OSSP mm shared memory allocation project at http://www.ossp.org/pkg/lib/mm/, and it looks really good. i don't know how i missed it earlier; i guess i was looking for a std-kind-of allocator and got put off by some messages that boost had given up on the idea. another btw - is the new std with C++0x gonna include something like this? from what i read it does include some fancy new stuff related to multithreaded and multicore programming, which is awesome. would be great once we get our hands on it)
so where was i....aaah...back to the drawing board, time for another brainstorming session...tomorrow :D
Wednesday, February 11, 2009
More on Berkeley DB
alright! time passes on and one gets wiser (or feels sillier and has more questions to ask).
the project i'm working on is written in python with c++ interop through SIP.
at one point we had to move from the solely single-threaded design (who would design a single-threaded SERVER may i ask, duh) to a scalable architecture.
the first obstacle we hit was the mighty GIL of Python (oh my, what didn't i try to overcome this...) - fighting this monster made me an expert in providing solutions for scalability and performance optimizations. or not.
anyway, one of the problems we had was that somehow the data we read from berkeley db would become corrupted seemingly without any reason. dude...what a mess. our system is complicated enough so it took some time till we came to investigate the possibility that bdb was somehow the reason for this.
i studied the solutions for multithreaded and multiprocess access to bdb, added DbEnv, and also played a bit with DB_DBT_USERMEM because we thought that simply relying on DB_DBT_MALLOC might lead to mem leaks (i still have to thoroughly test and make sure that this is not an issue).
eventually we came to this - whenever you fork and the parent and any child processes share bdb handles - CLOSE AND REOPEN THE DB HANDLES IN THE CHILD PROCESS!
this can save you a couple of hours (or weeks) of hitting your head against the wall, so do this, get your bonus and go happily home to your family for the day
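the shape of the close-and-reopen rule, as a sketch. to keep it self-contained i use a plain file descriptor as a stand-in for the db handle: Handle, reopen() and child_reopens_handle() are hypothetical names, and in the real code the close/open pair would be the corresponding berkeley db calls on the Db (and DbEnv) objects instead. assumes a POSIX system.

```cpp
#include <fcntl.h>
#include <string>
#include <utility>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// stand-in for a db handle: just a read-only file descriptor on a path
struct Handle {
    std::string path;
    int fd = -1;

    explicit Handle(std::string p) : path(std::move(p)) { open_it(); }
    ~Handle() { close_it(); }
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;

    void open_it()  { fd = ::open(path.c_str(), O_RDONLY); }
    void close_it() { if (fd >= 0) { ::close(fd); fd = -1; } }
    void reopen()   { close_it(); open_it(); }   // drop inherited, open own
    bool ok() const { return fd >= 0; }
};

// forks; the child closes the handle it inherited and opens a fresh one
// before doing anything else. returns true if the child got a working
// handle and the parent's handle is still fine afterwards.
inline bool child_reopens_handle(Handle& h) {
    pid_t pid = fork();
    if (pid < 0) return false;          // fork failed
    if (pid == 0) {
        h.reopen();                     // first thing in the child
        _exit(h.ok() ? 0 : 1);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return false;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0 && h.ok();
}
```

the point is only the ordering: reopen in the child before the first read or write, so parent and child never work through the same inherited handle.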
cheers
to be continued...