Computing environments become increasingly parallel, and it seems likely that we will see more cores on tomorrow’s desktops and server platforms. In a highly parallel system, tracing garbage collectors may not scale well due to deep heap structures that hinder parallel tracing. Previous work has discovered vulnerabilities within standard Java benchmarks. In this work we examine these standard benchmarks and analyze them to expose the data structures that make current Java benchmarks create deep heap shapes. It turns out that the problem is manifested mostly with benchmarks that employ queues and linked-lists. We then propose a new construction of a lock-free queue data structure with extra references that enables better garbage collector parallelism at a low overhead.