Data Version Control. Beta release.

Git extension for data scientists: orchestrate code and data

Data Version Control is an open source tool to manage your code and data together. One of the biggest challenges in reusing, and hence managing, ML projects is reproducibility. To address it, we have built Data Version Control, or DVC.

Renuka Apte
Finally, version control for machine learning models! I like that, beyond versioning the models, it caches the intermediate data files and the commands used to compute them. Do I have to use S3? P.S. The video is adorable!
Ruslan Kuprieiev
@renuka Thanks! No, you don't have to use S3; dvc works perfectly without any cloud setup by simply storing your cache locally. You only need to set up cloud storage if you want to share your data files (including intermediate data in your pipeline) with others.