Collab Summit 2016 has ended
Wednesday, March 30 • 10:30am - 10:55am
GitHub Data and Metrics - Jeff McAffer, Microsoft

GitHub hosts millions of people collaborating on more than 20 million repositories. This is an unprecedented treasure trove of data for software engineering researchers, companies and project teams alike:
* Researchers take interest in developer behavior and code evolution – branching, collaboration, bug/fix rates, software quality and distributed software development.
* Companies look at how projects, their own and others', are doing and discover trends in the industry.
* Project teams want to understand their health, uptake of their offerings, API usage and more.

In this session, we'll give you a first look at GHTorrent/DataLake, an infrastructure for tracking the activity of all (20 million!) public GitHub repos and their thousands (and thousands) of events per hour. We'll talk about (and show) real insights in areas ranging from contribution handling with pull requests and issues to API usage, tool adoption and notions of project health. This work is applicable to researchers, developers, community members/managers, product teams and executive sponsors. We'll also outline how it all works and our plans for making the data widely available.

Jeff McAffer

Director, Open Source Programs Office, Microsoft
Jeff McAffer is the Director of the Open Source Programs Office at Microsoft where he and the team are helping drive the company’s transition to an “open source engagement first” model. He was one of the founders of the Eclipse open source project where he was an active community...

Wednesday March 30, 2016 10:30am - 10:55am PDT
Grand Sierra Ballroom A