Kunle Olukotun

Cadence Design Systems Professor,
Electrical Engineering & Computer Science,
Stanford University

Publications

Plasticine: A Reconfigurable Architecture For Parallel Patterns
Raghu Prabhakar, Yaqi Zhang, David Koeplinger, Matt Feldman, Tian Zhao, Stefan Hadjis, Ardavan Pedram, Christos Kozyrakis, Kunle Olukotun
ISCA '17: 44th International Symposium on Computer Architecture, June 2017.
Paper PDF

Understanding and Optimizing Asynchronous Low-Precision Stochastic Gradient Descent
Christopher De Sa, Matthew Feldman, Christopher Ré, Kunle Olukotun
ISCA '17: 44th International Symposium on Computer Architecture, June 2017.
Paper PDF

EmptyHeaded: A Relational Engine for Graph Processing
Christopher R. Aberger, Susan Tu, Kunle Olukotun, and Christopher Ré
SIGMOD '16: Special Interest Group on Management of Data, June 2016. (Best Of Award)
Paper PDF | Slides

Ensuring Rapid Mixing and Low Bias for Asynchronous Gibbs Sampling
Christopher De Sa, Kunle Olukotun, and Christopher Ré
ICML '16: Proceedings of the 33rd Intl. Conference on Machine Learning, June 2016. (Best Paper Award)
Paper PDF | Slides | Poster

Automatic Generation of Efficient Accelerators for Reconfigurable Hardware
David Koeplinger, Raghu Prabhakar, Yaqi Zhang, Christina Delimitrou, Christos Kozyrakis, and Kunle Olukotun
ISCA '16: 43rd International Symposium on Computer Architecture, June 2016.
Paper PDF | Slides

Generating Configurable Hardware from Parallel Patterns
Raghu Prabhakar, David Koeplinger, Kevin J. Brown, HyoukJoong Lee, Christopher De Sa, Christos Kozyrakis, and Kunle Olukotun
ASPLOS '16: 21st International Conference on Architectural Support for Programming Languages and Operating Systems, April 2016.
Paper PDF

Have Abstraction and Eat Performance, Too: Optimized Heterogeneous Computing with Parallel Patterns
Kevin J. Brown, HyoukJoong Lee, Tiark Rompf, Arvind K. Sujeeth, Christopher De Sa, Christopher Aberger, and Kunle Olukotun
CGO '16: International Symposium on Code Generation and Optimization, March 2016.
Paper PDF

Rapidly Mixing Gibbs Sampling for a Class of Factor Graphs Using Hierarchy Width
Christopher De Sa, Ce Zhang, Christopher Ré, and Kunle Olukotun
NIPS '15: Proceedings of the 28th Neural Information Processing Systems Conference, December 2015.
Paper PDF | Poster

Taming the Wild: A Unified Analysis of Hogwild!-Style Algorithms
Christopher De Sa, Ce Zhang, Christopher Ré, and Kunle Olukotun
NIPS '15: Proceedings of the 28th Neural Information Processing Systems Conference, December 2015.
Paper PDF | Poster

Global Convergence of Stochastic Gradient Descent for Some Non-convex Matrix Problems
Christopher De Sa, Kunle Olukotun, and Christopher Ré
ICML '15: Proceedings of the 32nd Intl. Conference on Machine Learning, July 2015.
Paper PDF | Slides | Poster

Locality-Aware Mapping of Nested Parallel Patterns on GPUs
HyoukJoong Lee, Kevin J. Brown, Arvind K. Sujeeth, Tiark Rompf, and Kunle Olukotun
MICRO '14: 47th International Symposium on Microarchitecture, December 2014.
Paper PDF | Slides | Poster

Delite: A Compiler Architecture for Performance-Oriented Embedded Domain-Specific Languages
Arvind K. Sujeeth, Kevin J. Brown, HyoukJoong Lee, Tiark Rompf, Hassan Chafi, Martin Odersky, and Kunle Olukotun
TECS '14: ACM Transactions on Embedded Computing Systems, July 2014.
Paper PDF

Simplifying Scalable Graph Processing with a Domain-Specific Language
Sungpack Hong, Semih Salihoglu, Jennifer Widom, and Kunle Olukotun
CGO '14: International Symposium on Code Generation and Optimization, February 2014.
Paper PDF

Hardware Acceleration of Database Operations
Jared Casper and Kunle Olukotun
FPGA '14: Proceedings of the 2014 ACM/SIGDA international symposium on Field-programmable gate arrays, February 2014.
Paper PDF | Slides

On Fast Parallel Detection of Strongly Connected Components (SCC) in Small-World Graphs
Sungpack Hong, Nicole C. Rodia, and Kunle Olukotun
SC '13: International Conference for High Performance Computing, Networking, Storage, and Analysis, November 2013.
Paper PDF | Slides | Code

Forge: Generating a High Performance DSL Implementation from a Declarative Specification
Arvind K. Sujeeth, Austin Gibbons, Kevin J. Brown, HyoukJoong Lee, Tiark Rompf, Martin Odersky, and Kunle Olukotun
GPCE '13: 12th International Conference on Generative Programming: Concepts & Experiences, October 2013.
Paper PDF

Composition and Reuse with Compiled Domain-Specific Languages
Arvind K. Sujeeth, Tiark Rompf, Kevin J. Brown, HyoukJoong Lee, Hassan Chafi, Victoria Popic, Michael Wu, Aleksander Prokopec, Vojin Jovanovic, Martin Odersky, and Kunle Olukotun
ECOOP '13: European Conference on Object-Oriented Programming, July 2013.
Paper PDF

Optimizing Data Structures in High-Level Programs: New Directions for Extensible Compilers based on Staging
Tiark Rompf, Arvind K. Sujeeth, Nada Amin, Kevin J. Brown, Vojin Jovanovic, HyoukJoong Lee, Manohar Jonnalagedda, Kunle Olukotun, and Martin Odersky
POPL '13: 40th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, January 2013.
Paper PDF | Slides

A Case of System-level Hardware/Software Co-design and Co-verification of a Commodity Multi-Processor System with Custom Hardware
Sungpack Hong, Tayo Oguntebi, Jared Casper, Nathan Bronson, Christos Kozyrakis, and Kunle Olukotun
CODES+ISSS '12: 17th International Conference on Hardware/Software Codesign and System Synthesis, October 2012.
Paper PDF | Slides

Green-Marl: A DSL for Easy and Efficient Graph Analysis
Sungpack Hong, Hassan Chafi, Eric Sedlar, and Kunle Olukotun
ASPLOS '12: 17th International Conference on Architectural Support for Programming Languages and Operating Systems, March 2012.
Paper PDF | Slides

Efficient Parallel Graph Exploration on Multi-Core CPU and GPU
Sungpack Hong, Tayo Oguntebi, and Kunle Olukotun
PACT '11: 20th International Conference on Parallel Architectures and Compilation Techniques, October 2011.
Paper PDF | Slides

A Heterogeneous Parallel Framework for Domain-Specific Languages
Kevin J. Brown, Arvind K. Sujeeth, HyoukJoong Lee, Tiark Rompf, Hassan Chafi, Martin Odersky, and Kunle Olukotun
PACT '11: 20th International Conference on Parallel Architectures and Compilation Techniques, October 2011.
Paper PDF | Slides

Implementing Domain-Specific Languages for Heterogeneous Parallel Computing
HyoukJoong Lee, Kevin J. Brown, Arvind K. Sujeeth, Hassan Chafi, Tiark Rompf, Martin Odersky, and Kunle Olukotun
IEEE Micro: Special Issue on CPU, GPU, and Hybrid Computing, September 2011.
Paper PDF

Building-Blocks for Performance Oriented DSLs
Tiark Rompf, Arvind K. Sujeeth, HyoukJoong Lee, Kevin J. Brown, Hassan Chafi, Martin Odersky, and Kunle Olukotun
DSL '11: IFIP Working Conference on Domain-Specific Languages, September 2011.
Paper PDF | Slides

OptiML: An Implicitly Parallel Domain-Specific Language for Machine Learning
Arvind K. Sujeeth, HyoukJoong Lee, Kevin J. Brown, Tiark Rompf, Hassan Chafi, Michael Wu, Anand R. Atreya, Martin Odersky, and Kunle Olukotun
ICML '11: Proceedings of the 28th Intl. Conference on Machine Learning, June 2011.
Paper PDF | Slides

Hardware Acceleration of Transactional Memory on Commodity Systems
Jared Casper, Tayo Oguntebi, Sungpack Hong, Nathan G. Bronson, Christos Kozyrakis, and Kunle Olukotun
ASPLOS '11: Proceedings of the 16th Intl. Conference on Architectural Support for Programming Languages and Operating Systems, March 2011.
Paper PDF | Slides

Accelerating CUDA Graph Algorithms at Maximum Warp
Sungpack Hong, Sang Kyun Kim, Tayo Oguntebi, and Kunle Olukotun
PPoPP '11: Proceedings of the 16th Annual Symposium on Principles and Practice of Parallel Programming, February 2011.
Paper PDF | Slides

A Domain-Specific Approach to Heterogeneous Parallelism
Hassan Chafi, Arvind K. Sujeeth, Kevin J. Brown, HyoukJoong Lee, Anand R. Atreya, and Kunle Olukotun
PPoPP '11: Proceedings of the 16th Annual Symposium on Principles and Practice of Parallel Programming, February 2011.
Paper PDF | Slides

EigenBench: A Simple Exploration Tool for Orthogonal TM Characteristics
Sungpack Hong, Tayo Oguntebi, Jared Casper, Nathan Bronson, Christos Kozyrakis, and Kunle Olukotun
IISWC '10: Proceedings of the IEEE International Symposium on Workload Characterization, December 2010. (Best Paper Award)
Paper PDF | Slides

Language Virtualization for Heterogeneous Parallel Computing
Hassan Chafi, Zach DeVito, Adriaan Moors, Tiark Rompf, Arvind K. Sujeeth, Pat Hanrahan, Martin Odersky, and Kunle Olukotun
Onward! '10: Proceedings of the ACM International Conference on Object Oriented Programming Systems Languages and Applications, October 2010.
Paper PDF | Slides

Transactional Predication: High-Performance Concurrent Sets and Maps for STM
Nathan G. Bronson, Jared Casper, Hassan Chafi, and Kunle Olukotun
PODC '10: Proceedings of the 29th Annual ACM Conference on Principles of Distributed Computing, July 2010.
Paper PDF | Slides

Implementing and Evaluating Nested Parallel Transactions in Software Transactional Memory
Woongki Baek, Nathan Bronson, Christos Kozyrakis, and Kunle Olukotun
SPAA '10: Proceedings of the 22nd ACM Symposium on Parallelism in Algorithms and Architectures, June 2010.
Paper PDF | Slides

Making Nested Parallel Transactions Practical using Lightweight Hardware Support
Woongki Baek, Nathan Bronson, Christos Kozyrakis, and Kunle Olukotun
ICS '10: Proceedings of the 24th Intl. Conference on Supercomputing, June 2010.
Paper PDF | Slides

A Large-scale Architecture for Restricted Boltzmann Machines
Sang Kyun Kim, Peter L. McMahon, and Kunle Olukotun
FCCM '10: Proceedings of the IEEE Symposium on Field-Programmable Custom Computing Machines, May 2010.
Paper PDF

FARM: A Prototyping Environment for Tightly-Coupled, Heterogeneous Architectures
Tayo Oguntebi, Sungpack Hong, Jared Casper, Nathan Bronson, Christos Kozyrakis, and Kunle Olukotun
FCCM '10: The 18th Annual International IEEE Symposium on Field-Programmable Custom Computing Machines, May 2010.
Paper PDF | Slides

CCSTM: A Library-Based STM for Scala
Nathan G. Bronson, Hassan Chafi, and Kunle Olukotun
The First Annual Scala Workshop at Scala Days 2010, April 2010.
Paper PDF | Slides

Implementing and Evaluating a Model Checker for Transactional Memory Systems
Woongki Baek, Nathan G. Bronson, Christos Kozyrakis, and Kunle Olukotun
ICECCS '10: Proceedings of the 15th IEEE International Conference on Engineering of Complex Computer Systems, March 2010.
Paper PDF

A Practical Concurrent Binary Search Tree
Nathan G. Bronson, Jared Casper, Hassan Chafi, and Kunle Olukotun
PPoPP '10: Proceedings of the 15th Annual Symposium on Principles and Practice of Parallel Programming, January 2010.
Paper PDF | Slides

A Highly Scalable Restricted Boltzmann Machine FPGA Implementation
Sang Kyun Kim, Lawrence C. McAfee, Peter L. McMahon, and Kunle Olukotun
FPL '09: Proceedings of the IEEE Conference on Field Programmable Logic and Applications, September 2009.
Paper PDF

Feedback-Directed Barrier Optimization in a Strongly Isolated STM
Nathan G. Bronson, Christos Kozyrakis, and Kunle Olukotun
POPL '09: Proceedings of the 36th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, January 2009.
Paper PDF | Slides

Tradeoffs in Transactional Memory Virtualizations
JaeWoong Chung, Chi Cao Minh, Austen McDonald, Hassan Chafi, Brian D. Carlstrom, Travis Skare, Christos Kozyrakis and Kunle Olukotun
Proceedings of the Twelfth International Conference on Architectural Support for Programming Languages and Operating Systems, October 2006.
Paper PDF

Architectural Semantics for Practical Transactional Memory
Austen McDonald, JaeWoong Chung, Brian D. Carlstrom, Chi Cao Minh, Hassan Chafi, Christos Kozyrakis, Kunle Olukotun
Proceedings of the 33rd Annual International Symposium on Computer Architecture, Boston, Massachusetts, June 17-21, 2006.
Paper PDF

The Atomos Transactional Programming Language
Brian D. Carlstrom, Austen McDonald, Hassan Chafi, JaeWoong Chung, Chi Cao Minh, Christos Kozyrakis, Kunle Olukotun
Proceedings of the ACM SIGPLAN 2006 Conference on Programming Language Design and Implementation, Ottawa, Canada, June 12, 2006.
Paper PDF

The Software Stack for Transactional Memory: Challenges and Opportunities
Brian D. Carlstrom, JaeWoong Chung, Christos Kozyrakis, Kunle Olukotun
First Workshop on Software Tools for Multi-Core Systems, Manhattan, New York, NY, 26 March 2006.
Paper PDF

The Common Case Transactional Behavior of Multithreaded Programs
JaeWoong Chung, Hassan Chafi, Chi Cao Minh, Austen McDonald, Brian D. Carlstrom, Christos Kozyrakis, and Kunle Olukotun
12th International Symposium on High Performance Computer Architecture (HPCA), Austin, Texas, USA, 11-15 February 2006.
Paper PDF

Transactional Execution of Java Programs
Brian D. Carlstrom, JaeWoong Chung, Hassan Chafi, Austen McDonald, Chi Cao Minh, Lance Hammond, Christos Kozyrakis, and Kunle Olukotun
OOPSLA 2005 Workshop on Synchronization and Concurrency in Object-Oriented Languages (SCOOL), San Diego, California, USA, October 16, 2005.
Paper PDF

Characterization of TCC on Chip-Multiprocessors
Austen McDonald, JaeWoong Chung, Hassan Chafi, Chi Cao Minh, Brian D. Carlstrom, Lance Hammond, Christos Kozyrakis, and Kunle Olukotun
The Fourteenth International Conference on Parallel Architectures and Compilation Techniques, Saint Louis, Missouri, September 19, 2005.
Paper PDF

Maximizing CMP Throughput with Mediocre Cores
John D. Davis, James Laudon, Kunle Olukotun
The Fourteenth International Conference on Parallel Architectures and Compilation Techniques, Saint Louis, Missouri, September 19, 2005.
Paper PDF

The Future of Microprocessors
Kunle Olukotun and Lance Hammond
ACM QUEUE Magazine, September 2005.
Paper PDF

TAPE: A Transactional Application Profiling Environment
Hassan Chafi, Chi Cao Minh, Austen McDonald, Brian D. Carlstrom, JaeWoong Chung, Lance Hammond, Christos Kozyrakis, Kunle Olukotun
The 19th ACM International Conference on Supercomputing, Cambridge, MA, Sunday, June 20, 2005.
Paper PDF

Exposing Speculative Thread Parallelism in SPEC2000
Manohar Prabhu and Kunle Olukotun
Proceedings of the 2005 Symposium on Principles and Practice of Parallel Programming, Chicago, IL, June 2005.
Paper PDF

Niagara: A 32-Way Multithreaded SPARC Processor
Poonacha Kongetira, Kathirgamar Aingaran, and Kunle Olukotun
IEEE MICRO Magazine, March-April 2005, and presented at Hot Chips 16, August 2004.
Paper PDF

Article about Kunle Olukotun's Niagara processor: Sun's Big Splash
Linda Geppert
IEEE Spectrum Magazine, January 2005.
Paper PDF

Transactional Coherence and Consistency: Simplifying Parallel Hardware and Software
Lance Hammond, Brian D. Carlstrom, Vicky Wong, Michael Chen, Christos Kozyrakis, Kunle Olukotun
Micro's Top Picks, IEEE Micro November/December 2004 (Vol. 24, No. 6).
Paper PDF

Programming with Transactional Coherence and Consistency (TCC)
Lance Hammond, Brian D. Carlstrom, Vicky Wong, Ben Hertzberg, Mike Chen, Christos Kozyrakis, and Kunle Olukotun
Proceedings of the Eleventh International Conference on Architectural Support for Programming Languages and Operating Systems, Boston, Massachusetts, October 9-13, 2004.
Paper PDF

Transactional Memory Coherence and Consistency
Lance Hammond, Vicky Wong, Mike Chen, Ben Hertzberg, Brian D. Carlstrom, John D. Davis, Manohar K. Prabhu, Honggo Wijaya, Christos Kozyrakis, and Kunle Olukotun
Proceedings of the 31st Annual International Symposium on Computer Architecture, München, Germany, June 19-23, 2004.
Paper PDF

The Jrpm System for Dynamically Parallelizing Java Programs
Mike Chen and Kunle Olukotun
Special Issue of IEEE Micro: Micro's Top Picks from Computer Architecture Conferences, Nov./Dec. 2003.
Paper PDF

Using Thread-Level Speculation to Simplify Manual Parallelization
Manohar Prabhu and Kunle Olukotun
Proceedings of the 2003 Symposium on Principles and Practice of Parallel Programming, San Diego, CA, June 2003.
Paper PDF

The Jrpm System for Dynamically Parallelizing Java Programs
Mike Chen and Kunle Olukotun
Proceedings of the 30th International Symposium on Computer Architecture, San Diego, CA, June 2003.
Paper PDF

TEST: A Tracer for Extracting Speculative Threads
Mike Chen and Kunle Olukotun
The 2003 International Symposium on Code Generation and Optimization, San Francisco, CA, March 2003.
Paper PDF

The Stanford Hydra CMP
Lance Hammond, Ben Hubbert, Michael Siu, Manohar Prabhu, Mike Chen, and Kunle Olukotun
IEEE MICRO Magazine, March-April 2000, and presented at Hot Chips 11, August 1999.
Paper PDF

Improving the Performance of Speculatively Parallel Applications on the Hydra CMP
Kunle Olukotun, Lance Hammond, and Mark Willey
Proceedings of the 1999 ACM International Conference on Supercomputing, Rhodes, Greece, June 1999.
Paper PDF

Exploiting Method-Level Parallelism in Single-Threaded Java Programs
Mike Chen and Kunle Olukotun
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, Paris, France, October 1998.
Paper PDF

Data Speculation Support for a Chip Multiprocessor
Lance Hammond, Mark Willey, and Kunle Olukotun
Proceedings of the Eighth ACM Conference on Architectural Support for Programming Languages and Operating Systems, San Jose, California, October 1998.
Paper PDF

Considerations in the Design of Hydra: A Multiprocessor-on-a-Chip Microarchitecture
Lance Hammond and Kunle Olukotun
Stanford University Computer Systems Lab Technical Report CSL-TR-98-749, February 1998.
Paper PDF

A Single-Chip Multiprocessor
Lance Hammond, Basem A. Nayfeh and Kunle Olukotun
IEEE Computer Special Issue on "Billion-Transistor Processors", September 1997.
Paper PDF

A Single Chip Multiprocessor Integrated with DRAM
Tadaaki Yamauchi, Lance Hammond and Kunle Olukotun
Workshop on Mixing Logic and DRAM preceding the 24th International Symposium on Computer Architecture, June 1997.
Paper PDF

Software and Hardware for Exploiting Speculative Parallelism with a Multiprocessor
Jeffery Oplinger, David Heine, Shih-Wei Liao, Basem A. Nayfeh, Monica Lam and Kunle Olukotun
Stanford University Computer Systems Lab Technical Report CSL-TR-97-715, February 1997.
Paper PDF

The Case for a Single-Chip Multiprocessor
Kunle Olukotun, Basem A. Nayfeh, Lance Hammond, Ken Wilson and Kun-Yung Chang
Proceedings of the Seventh International Symposium on Architectural Support for Programming Languages and Operating Systems, October 1996.
Paper PDF

Evaluation of Design Alternatives for a Multiprocessor Microprocessor
Basem A. Nayfeh, Lance Hammond and Kunle Olukotun
Proceedings of the 23rd International Symposium on Computer Architecture, May 1996.
Paper PDF

Rationale and Design of the Hydra Multiprocessor
Note: This is the original MCM-based design.
Kunle Olukotun, Jules Bergmann, Kun-Yung Chang and Basem A. Nayfeh
Stanford University Computer Systems Lab Technical Report CSL-TR-94-645, 1994.
Paper PDF