The Cell processor exemplifies the trade-offs made when designing a mass-market, power-efficient multi-core machine, but its machine-exposing architecture and raw communication mechanisms are hard for a programmer to manage. Cell's hardware design is simple, which shifts complexity onto software in three areas: achieving low threading overhead, good bandwidth efficiency, and load balance. Several attempts have been made to produce efficient and effective programming systems for Cell, but they have been too specialized and thus fall short. We present Jack Rabbit, an efficient thread-pool work-queue implementation with load-balancing mechanisms and double buffering. Our system incurs low threading overhead, achieves good load balance, and attains bandwidth efficiency. It represents a step towards an effective way to program Cell and any similar current or future processors.