Scalable Use of the STL

Abstract

STL’s performance has been satisfactory for a large range of applications. However, large-scale programs with tight performance requirements call for special care that smaller data sets do not. Amortized constant time is not terribly useful if it is multiplied by a constant that makes your application run forever. Doubling the size of a vector has unexpected effects. STL’s node allocation policy, while useful at small scale, may badly hurt performance. The allocator interface is unable to perform in-place reallocation. Even using realloc for moveable types is suboptimal. Finally, all stock allocators are very coy about releasing memory back to the operating system, which creates odd problems in long-running programs. This talk focuses on concrete tips for using STL (and C++ collections in general) scalably.
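
As a rough illustration of why the growth factor of a vector matters, the sketch below models one well-known effect of doubling. This is an assumption about which effect the abstract has in mind, not a statement of the talk's content. The function name `can_reuse_freed_memory` and the idealized allocator (contiguous blocks, perfect coalescing of freed neighbors) are hypothetical. Under that model, a factor of 2 never lets a new block fit into the space released by earlier ones, because 1 + 2 + ... + 2^(k-1) = 2^k - 1 is always smaller than the next request of 2^(k+1).

```cpp
#include <cstddef>

// Returns true if, at some growth step, the blocks released by earlier
// reallocations add up to at least the size of the next block requested.
// Idealized model (an assumption for illustration): blocks are laid out
// contiguously and adjacent freed blocks coalesce perfectly.
bool can_reuse_freed_memory(double factor, int steps) {
    double capacity = 1.0;  // current block; still live while elements are copied
    double freed = 0.0;     // total size of blocks already released
    for (int i = 0; i < steps; ++i) {
        double next = capacity * factor;   // size of the block about to be requested
        if (freed >= next) return true;    // the freed space could hold the new block
        freed += capacity;                 // old block is released after the copy
        capacity = next;
    }
    return false;
}
```

In this model, `can_reuse_freed_memory(2.0, 40)` returns false, while `can_reuse_freed_memory(1.5, 40)` returns true after a handful of steps. Real allocators are not this tidy, so treat the result as intuition rather than a measurement.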

Highlights

  • Gives prescriptive advice on managing large std::vectors, with surprising connections to mathematics
  • Explains how the STL interacts with the C library's and the operating system's memory allocator
  • Explains how to minimize contention in multithreaded applications using the STL
  • Explains the relative advantages and liabilities of std::string versus an in-house string class 

Attendee profile

This is a class aimed at senior engineers and system architects. Understanding of basic STL containers is assumed. It is also assumed that the attendees are using or considering using the STL in applications with large data sets and demanding speed requirements. High-level understanding of classic, lock-based concurrency is necessary. Understanding of basic algebra is a plus.

Outline

  • Premise
  • Exercise care with std::vector
    • Composition
    • Expansion with push_back
    • Vector expansion basics
    • Shocking discovery
    • Tests
    • What's a good expansion factor?
    • Generalization
    • Specification
  • Understand (and possibly replace) your memory allocator
    • Allocators
    • Windows
    • Linux, FreeBSD
    • STL-related allocator vagaries
    • More STL-related allocator vagaries
    • More STL-related allocator vagaries
  • Minimize contention with STL containers
    • Basic usage
    • Do work on the side, lock, and swap
    • Initial vs. better
    • Push copies into parameters
    • Splendid
    • Use Reader-Writer locks
    • Upgrading/downgrading R-W locks
    • Workaround: double checking
  • Define your own string class
    • std::string: a string of poor decisions
    • The COW is dead, long live eager copying
    • Problems (I)/(II)
    • The stock std::string
    • Your own string
  • Conclusion
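
The outline item "Do work on the side, lock, and swap" can be read as the following pattern; this is a minimal sketch of one plausible interpretation, not the talk's own code. The class `SharedTable` and its members are hypothetical names introduced for illustration. The expensive construction happens on a private vector with no lock held, and the mutex is taken only for a constant-time swap.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical shared state: a vector guarded by a mutex.
class SharedTable {
public:
    // Build the new contents outside the lock, then hold the lock
    // only for the constant-time swap.
    void rebuild(std::size_t n) {
        std::vector<int> fresh;
        fresh.reserve(n);
        for (std::size_t i = 0; i < n; ++i) {
            fresh.push_back(static_cast<int>(i * i));  // work done "on the side"
        }
        {
            std::lock_guard<std::mutex> guard(mutex_);
            data_.swap(fresh);  // exchanges internal buffers; no element copies
        }
        // fresh now owns the old contents and is destroyed here,
        // after the lock has been released.
    }

    // Returns a copy so callers never touch data_ without the lock.
    std::vector<int> snapshot() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return data_;
    }

private:
    mutable std::mutex mutex_;
    std::vector<int> data_;
};
```

The design choice worth noting is that both the allocation of the new buffer and the deallocation of the old one happen outside the critical section, so other threads wait only for the swap itself.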