philip lelyveld The world of entertainment technology


Data meets science: Open access, code, datasets, and knowledge graphs for machine learning research and beyond

A new interconnected ecosystem for research is shaping up, and machine learning is just the tip of the iceberg.


"To succeed at becoming a data-driven organization, your employees should always use data to start, continue, or conclude every single business decision, no matter how major or minor."

That quote belongs to Ashish Thusoo, author of the DataOps book, founder of Qubole, and one of the people who built the data-driven culture at Facebook as early as 2007.

To make research available to as many people as possible, as soon as possible, many researchers choose to publish their work on pre-print repositories like arXiv or Zenodo. Pre-prints address the open access issue, as they are immediately accessible for free.

Most pre-prints will be revised, in minor or major ways, and some may never be formally published at all. But even for the ones that do go through the review and publication process successfully, an equally important issue remains: reproducibility.

According to a 2016 Nature survey, more than 70% of researchers have tried and failed to reproduce another scientist's experiments, and more than half have failed to reproduce their own experiments.


Enter Papers with Code. Papers with Code is another repository for research. Its stated mission is to create a free and open resource with machine learning papers, code, and evaluation tables. It highlights trending machine learning research and the code to implement it.

Papers with Code was founded by Robert Stojnic and Ross Taylor in 2018. Stojnic and Taylor joined Facebook AI in 2019. Since then, the team has grown, partnered with arXiv, and expanded to more disciplines.

The latest addition to Papers with Code's arsenal is data. The repository now indexes 3,000+ research datasets from machine learning. Users can now find datasets by task and modality, compare usage over time, and browse benchmarks.

Connected Papers is a free visual tool that helps researchers and applied scientists find and explore papers relevant to their field of work, in any domain. It creates a graph for each paper in its repository, by analyzing about 50,000 papers and selecting the few dozen with the strongest connections to the origin paper.
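The selection step described above can be sketched as a top-k ranking over candidate papers. This is a minimal illustration, not Connected Papers' actual method: the similarity measure (Jaccard overlap of reference lists), the function names, and the toy data are all assumptions made for the example.

```python
from heapq import nlargest


def jaccard(a, b):
    """Overlap of two reference sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def strongest_connections(origin_refs, candidates, k=3):
    """Return the ids of the k candidates most similar to the origin paper.

    origin_refs: set of reference ids cited by the origin paper (hypothetical input)
    candidates:  dict mapping paper id -> set of reference ids it cites
    Candidates with no overlap at all are dropped.
    """
    scored = ((jaccard(origin_refs, refs), pid) for pid, refs in candidates.items())
    return [pid for score, pid in nlargest(k, scored) if score > 0]


# Toy data: four candidate papers and the references each one cites.
candidates = {
    "A": {1, 2, 3},
    "B": {1, 2},
    "C": {9},
    "D": {1, 2, 3, 4},
}
top = strongest_connections({1, 2, 3}, candidates, k=3)
```

In a real tool the candidate pool would be tens of thousands of papers and k a few dozen, but the shape of the computation (score every candidate against the origin, keep the strongest few) stays the same.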

On Feb. 3, Connected Papers also announced a partnership with arXiv.

