Easy way to load, create, version, query and visualize computer vision datasets
📰 Hacker News · morpheusme
Learn to efficiently manage computer vision datasets with Hub by Activeloop, a Python package for easy data loading, creation, versioning, querying, and visualization
Action Steps
- Install Hub using pip: `pip install hub`
- Create a new dataset using `hub.empty`
- Load datasets from the cloud using `hub.load`, and update the data stored there
- Visualize datasets using the Activeloop platform
- Build machine learning pipelines using the Hub API
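The create-and-load steps above can be sketched roughly as follows. This is a minimal sketch assuming the Hub 2.x API: `hub.empty` and `hub.load` come from the steps above, while `create_tensor`, `append`, `.numpy()`, the tensor names `images` and `labels`, and the helper `build_and_reload` are assumptions or illustrative choices that the post does not mention. Hub and NumPy are imported inside the function so that the definition itself does not require them to be installed.

```python
def build_and_reload(path, images, labels):
    """Create a Hub dataset at `path`, append samples, then load it back.

    Assumes the Hub 2.x API (installed with `pip install hub`) and NumPy.
    `path` can be a local directory; cloud paths are assumed to work the same way.
    """
    # Imported lazily so this sketch can be defined without Hub installed.
    import hub
    import numpy as np

    ds = hub.empty(path)  # create a new, empty dataset
    with ds:
        # Tensor names and htypes are illustrative choices, not from the post.
        ds.create_tensor("images", htype="image", sample_compression="png")
        ds.create_tensor("labels", htype="class_label")
        for image, label in zip(images, labels):
            ds.images.append(np.asarray(image, dtype="uint8"))
            ds.labels.append(np.uint32(label))

    # Load the dataset again and read one sample back as a NumPy array.
    reloaded = hub.load(path)
    return reloaded.images[0].numpy().shape, len(reloaded)
```

Called with a fresh path, a list of image arrays, and a list of integer labels, the helper returns the shape of the first stored image and the number of samples in the reloaded dataset.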
Who Needs to Know This
Data scientists and machine learning engineers can benefit from using Hub to streamline their workflow and improve collaboration
Key Insight
💡 Hub streams datasets from the cloud, so large datasets can be used without storing them on the local machine
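To make the idea of working without local storage concrete, here is a small standard-library sketch of chunked, on-demand loading. It is not Hub's implementation or API: `StreamedDataset`, `fake_remote_fetch`, and the chunk sizes are hypothetical names and values chosen for illustration, and the "remote" store is simulated in memory.

```python
from collections import OrderedDict
from typing import Callable, Iterator, List


class StreamedDataset:
    """Hypothetical illustration: samples live in remote chunks fetched on demand."""

    def __init__(self, num_samples: int, chunk_size: int,
                 fetch_chunk: Callable[[int], List[int]],
                 max_cached_chunks: int = 1) -> None:
        self.num_samples = num_samples
        self.chunk_size = chunk_size
        self.fetch_chunk = fetch_chunk
        self.max_cached_chunks = max_cached_chunks
        self._cache: "OrderedDict[int, List[int]]" = OrderedDict()

    def __len__(self) -> int:
        return self.num_samples

    def __getitem__(self, index: int) -> int:
        if not 0 <= index < self.num_samples:
            raise IndexError(index)
        chunk_id, offset = divmod(index, self.chunk_size)
        if chunk_id not in self._cache:
            # Evict the oldest chunk so local memory use stays bounded.
            if len(self._cache) >= self.max_cached_chunks:
                self._cache.popitem(last=False)
            self._cache[chunk_id] = self.fetch_chunk(chunk_id)
        return self._cache[chunk_id][offset]

    def __iter__(self) -> Iterator[int]:
        for index in range(self.num_samples):
            yield self[index]


NUM_SAMPLES = 10
CHUNK_SIZE = 4
fetch_log: List[int] = []


def fake_remote_fetch(chunk_id: int) -> List[int]:
    """Stand-in for a network request; records which chunks were requested."""
    fetch_log.append(chunk_id)
    start = chunk_id * CHUNK_SIZE
    return list(range(start, min(start + CHUNK_SIZE, NUM_SAMPLES)))


dataset = StreamedDataset(NUM_SAMPLES, CHUNK_SIZE, fake_remote_fetch)
samples = list(dataset)  # iterates every sample while holding one chunk at a time
```

Only one chunk is held locally at any moment, so memory use depends on the chunk size rather than on the total dataset size. The trade-off in this sketch is that revisiting an evicted chunk triggers another fetch.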
Share This
Streamline your computer vision workflow with Hub by Activeloop! Load, create, version, query, and visualize datasets with ease #machinelearning #computervision
Full Article
Hi HN, in machine learning we work with tensor-based computations (that's the language ML models think in). I've recently discovered a project that makes it much easier to set up and run machine learning projects, and lets you create and store datasets in a deep-learning-native format. Hub by Activeloop (https://github.com/activeloopai/Hub) is an open-source Python package that arranges data in NumPy-like arrays. It integrates smoothly with deep learning frameworks such as TensorFlow and PyTorch for faster GPU processing and training. In addition, you can update the data stored in the cloud, create machine learning pipelines using the Hub API, and interact with datasets (e.g. visualize them) on the Activeloop platform (https://app.activeloop.ai). The real benefit for me is that I can stream my datasets without having to store them on my machine (my datasets can be 10GB+ in size, but it works just as well with 100GB+ datasets li
DeepCamp AI