wandb/ragbench-sentence-relevance-balanced dataset hosted on Hugging Face and contributed by the HF Datasets community
RAGTruth Dataset
Dataset Description
Dataset Summary
The RAGTruth dataset is designed for evaluating hallucinations in text generation models, particularly in retrieval-augmented generation (RAG) contexts. It contains examples of model outputs along with expert annotations indicating whether the outputs contain hallucinations.
Dataset Structure
Each example contains:
A query/question
Context passages
Model output
Hallucination labels (evident…
See the full description on the dataset page: https://huggingface.co/datasets/wandb/RAGTruth-processed.
FinQA Dataset (Processed)
Dataset Description
Dataset Summary
The FinQA dataset is designed for numerical reasoning over financial data, containing questions that require complex reasoning over tables and text from financial reports.
Dataset Statistics
Total examples: 8281
Training set size: 6624 examples
Test set size: 1657 examples
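As a quick sanity check on the statistics above, the two splits should sum to the total, and the training share works out to roughly 80%. A minimal sketch; the variable names are illustrative and not part of the dataset:

```python
# Split sizes as reported in the dataset statistics.
total_examples = 8281
train_size = 6624
test_size = 1657

# The two splits should account for every example.
splits_match_total = (train_size + test_size == total_examples)

# Share of examples in each split.
train_fraction = train_size / total_examples
test_fraction = test_size / total_examples

print(f"splits sum to total: {splits_match_total}")
print(f"train: {train_fraction:.1%}, test: {test_fraction:.1%}")
```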
Dataset Structure
Each example contains:
Required columns:
query: The question to be answered (derived…
See the full description on the dataset page: https://huggingface.co/datasets/wandb/finqa-data-processed.
Implementations of reinforcement learning algorithms.
Training and benchmarking assume you have a Weights & Biases project to upload runs to. By default, training runs go to an rl-algo-impls project while benchmarks go to rl-algo-impls-benchmarks. During training and benchmarking runs, videos of the best models and the model weights are uploaded to WandB.
Before doing anything below, you'll need to create a wandb account and run wandb login.
Benchmark runs are uploaded to WandB, where they can be made into reports (for example: https://api.wandb.ai/links/sgoodfriend/6p2sjqtn). So far I've found Lambda Labs A10 instances to be a good balance of performance (14 hours to train PPO in 14 environments [5 basic gym, 4 PyBullet, CarRacing-v0, and 4 Atari] across 3 seeds) vs cost ($0.60/hr).
git clone https://github.com/sgoodfriend/rl-algo-impls.git
cd rl-algo-impls
# git checkout BRANCH_NAME if running on non-main branch
bash ./scripts/setup.sh
wandb login
bash ./scripts/benchmark.sh [-a {"ppo"}] [-e ENVS] [-j {6}] [-p {rl-algo-impls-benchmarks}] [-s {"1 2 3"}]
Benchmarking runs are uploaded by default to an rl-algo-impls-benchmarks project. Runs upload
videos of the running best model and the weights of the best and last models.
Benchmarking runs are tagged with a shortened commit hash (e.g., benchmark_5598ebc) and
hostname (e.g., host_192-9-145-26).
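The tagging scheme described above can be sketched as follows. This is a hypothetical helper, not code from the repository: the function name is made up, the 7-character prefix is an assumption based on the example tag benchmark_5598ebc, and the full hash in the usage line is a placeholder whose first 7 characters match that example.

```python
from typing import List


def benchmark_tags(commit_hash: str, hostname: str, short_len: int = 7) -> List[str]:
    """Build run tags from a full commit hash and a hostname.

    Assumption: the shortened hash is simply the first `short_len`
    characters of the full hash.
    """
    short_hash = commit_hash[:short_len]
    return [f"benchmark_{short_hash}", f"host_{hostname}"]


# Placeholder hash; only the first 7 characters come from the example tag.
tags = benchmark_tags("5598ebc0123456789abcdef0123456789abcdef0", "192-9-145-26")
print(tags)
```

Tags in this form are what benchmark_publish.py expects to receive through --wandb-tags.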
Publishing benchmarks to Hugging Face requires logging in to Hugging Face with a write-capable API token:
git config --global credential.helper store
huggingface-cli login
# For example: python benchmark_publish.py --wandb-tags host_192-9-147-166 benchmark_1d4094f --wandb-report-url https://api.wandb.ai/links/sgoodfriend/099h4lvj
# --virtual-display likely must be specified if running on a remote machine.
python benchmark_publish.py --wandb-tags HOST_TAG COMMIT_TAG --wandb-report-url WANDB_REPORT_URL [--virtual-display]
Hyperparameter tuning can be done with the tuning/tuning.sh script, which runs
multiple processes of optimize.py. Start by doing all the setup meant for training
before running tuning/tuning.sh:
# Setup similar to training above
wandb login
bash scripts/tuning.sh -a ALGO -e ENV -j N_JOBS -s NUM_SEEDS
Three notebooks in the colab directory are set up for use with Google Colab.
My local development has been on an M1 Mac. These instructions might not be complete, but this is the approximate setup and usage I've been following:
brew install swig
brew install --cask xquartz
curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-arm64.sh
sh Miniconda3-latest-MacOSX-arm64.sh
conda env create -f environment.yml -n rl_algo_impls
conda activate rl_algo_impls
poetry install
Training, benchmarking,...
Same as HuggingFaceH4/deita-10k-v0-sft but without non-Latin text.
CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/
All models were taken from the Hugging Face question answering models and trained on @rhtsingh's processed dataset [ External Data - MLQA, XQUAD Preprocessing ], with logging through the built-in Weights & Biases logger in huggingface/transformers.
Figure: Evaluation Loss (https://raw.githubusercontent.com/SauravMaheshkar/chaii-Hindi-Tamil-QA/500ff923d44525d25d28a7b299995200b36c76cd/assets/Evaluation%20Loss.svg)
| Name | Training Loss | Evaluation Loss |
|---|---|---|
| electra-base-squad2 | 1.9823 | 2.27 |
| distilbert-base-cased-distilled-squad | 1.1694 | 1.31 |
| bert-base-cased-squad2 | 1.0992 | 1.26 |
| distilbert-base-uncased-distilled-squad | 1.0642 | 1.19 |
| bert-large-uncased-whole-word-masking-squad2 | 0.9206 | 1.02 |
| bert-large-uncased-whole-word-masking-finetuned-squad | 0.9068 | 1.01 |
| xlm-roberta-base-squad2 | 0.7908 | 0.90 |
| distilbert-**multi**-finetuned-for-xqua-on-tydiqa | 0.7827 | 0.89 |
| bert-**multi**-uncased-finetuned-xquadv1 | 0.7072 | 0.93 |
| bert-**multi**-cased-finetuned-xquadv1 | 0.6517 | 0.74 |
| bert-base-**multilingual**-cased-finetuned-squad | 0.6257 | 0.73 |
| xlm-**multi**-roberta-large-squad2 | 0.6209 | 0.74 |
| bert-**multi**-cased-finedtuned-xquad-tydiqa-goldp | 0.6156 | 0.70 |
| roberta-large-squad2 | 0.2488 | 0.36 |
| roberta-base-squad2 | 0.236 | 0.35 |
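To make the table easier to compare, the sketch below ranks the models by evaluation loss and computes the gap between evaluation and training loss. The numbers are copied from the table (bold markers in the names are dropped); the variable names are illustrative.

```python
# (name, training loss, evaluation loss), copied from the table above.
results = [
    ("electra-base-squad2", 1.9823, 2.27),
    ("distilbert-base-cased-distilled-squad", 1.1694, 1.31),
    ("bert-base-cased-squad2", 1.0992, 1.26),
    ("distilbert-base-uncased-distilled-squad", 1.0642, 1.19),
    ("bert-large-uncased-whole-word-masking-squad2", 0.9206, 1.02),
    ("bert-large-uncased-whole-word-masking-finetuned-squad", 0.9068, 1.01),
    ("xlm-roberta-base-squad2", 0.7908, 0.90),
    ("distilbert-multi-finetuned-for-xqua-on-tydiqa", 0.7827, 0.89),
    ("bert-multi-uncased-finetuned-xquadv1", 0.7072, 0.93),
    ("bert-multi-cased-finetuned-xquadv1", 0.6517, 0.74),
    ("bert-base-multilingual-cased-finetuned-squad", 0.6257, 0.73),
    ("xlm-multi-roberta-large-squad2", 0.6209, 0.74),
    ("bert-multi-cased-finedtuned-xquad-tydiqa-goldp", 0.6156, 0.70),
    ("roberta-large-squad2", 0.2488, 0.36),
    ("roberta-base-squad2", 0.236, 0.35),
]

# Rank by evaluation loss, lowest first.
by_eval = sorted(results, key=lambda row: row[2])
best_name, best_train, best_eval = by_eval[0]

# Gap between evaluation and training loss for each model.
gaps = {name: eval_loss - train_loss for name, train_loss, eval_loss in results}
largest_gap_name = max(gaps, key=gaps.get)

print(f"lowest evaluation loss: {best_name} ({best_eval})")
print(f"largest eval-train gap: {largest_gap_name} ({gaps[largest_gap_name]:.4f})")
```

On these numbers, roberta-base-squad2 has the lowest evaluation loss and electra-base-squad2 has the largest gap between evaluation and training loss.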
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The goal of this task is to train a model that can localize and classify each instance of Person and Car as accurately as possible.
from IPython.display import Markdown, display
display(Markdown("../input/Car-Person-v2-Roboflow/README.roboflow.txt"))
In this notebook, I processed the images with Roboflow because the images in the COCO-formatted dataset had different dimensions and the dataset was not split into different formats. To train a custom YOLOv7 model, we need to recognize the objects in the dataset. To do so, I took the following steps:
Image Credit - jinfagang
!git clone https://github.com/WongKinYiu/yolov7 # Downloading YOLOv7 repository and installing requirements
%cd yolov7
!pip install -qr requirements.txt
!pip install -q roboflow
!wget "https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7.pt"
import os
import glob
import wandb
import torch
from roboflow import Roboflow
from kaggle_secrets import UserSecretsClient
from IPython.display import Image, clear_output, display # to display images
print(f"Setup complete. Using torch {torch.__version__} ({torch.cuda.get_device_properties(0).name if torch.cuda.is_available() else 'CPU'})")
I will be integrating W&B for visualizations, logging artifacts, and comparing different models!
try:
    user_secrets = UserSecretsClient()
    wandb_api_key = user_secrets.get_secret("wandb_api")
    wandb.login(key=wandb_api_key)
    anonymous = None
except Exception:
    # Fall back to an anonymous W&B session when the secret is unavailable.
    wandb.login(anonymous='must')
    # The label below matches the name passed to get_secret above.
    print('To use your W&B account,\n'
          'Go to Add-ons -> Secrets and provide your W&B access token. Use the Label name as wandb_api.\n'
          'Get your W&B access token from here: https://wandb.ai/authorize')

wandb.init(project="YOLOvR", name="7. YOLOv7-Car-Person-Custom-Run-7")
Figure: computer vision cycle (https://uploads-ssl.webflow.com/5f6bc60e665f54545a1e52a5/615627e5824c9c6195abfda9_computer-vision-cycle.png)
In order to train our custom model, we need to assemble a dataset of representative images with bounding box annotations around the objects that we want to detect. And we need our dataset to be in YOLOv7 format.
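For reference, YOLO-style label files store one object per line as a class index followed by the box center, width, and height, all normalized by the image size. The sketch below converts a pixel-space corner box to that form. The function name and the class index used in the example are hypothetical, and the Roboflow export used in this notebook performs the conversion for you.

```python
def to_yolo_line(class_id, xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a pixel-space corner box to a normalized YOLO label line."""
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Example: a 200x200 pixel box in a 640x640 image, class index 1 (hypothetical).
line = to_yolo_line(1, 100, 200, 300, 400, 640, 640)
print(line)
```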
In Roboflow, we can choose between two paths:
Figure: Roboflow (https://raw.githubusercontent.com/Owaiskhan9654/Yolo-V7-Custom-Dataset-Train-on-Kaggle/main/Roboflow.PNG)
user_secrets = UserSecretsClient()
roboflow_api_key = user_secrets.get_secret("roboflow_api")
rf = Roboflow(api_key=roboflow_api_key)
project = rf.workspace("owais-ahmad").project("custom-yolov7-on-kaggle-on-custom-dataset-rakiq")
dataset = project.version(2).download("yolov7")
Here, I am able to pass a number of arguments:
- img: define input image size
- batch: determine
FAVA Dataset (Processed)
Dataset Description
Dataset Summary
The FAVA (Factual Association and Verification Annotations) dataset is designed for evaluating hallucinations in language model outputs. This processed version contains binary hallucination labels derived from detailed span-level annotations in the original dataset.
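One plausible way to collapse span-level annotations into a binary label is to mark an output as hallucinated when it has at least one annotated span. The sketch below assumes that rule and a hypothetical span representation (start offset, end offset, error type). The summary above does not specify the actual derivation, so treat this as an illustration only.

```python
from typing import List, Tuple

# Hypothetical span format: (start offset, end offset, error type).
Span = Tuple[int, int, str]


def binary_label(spans: List[Span]) -> int:
    """Return 1 if any span is annotated, else 0 (assumed rule)."""
    return int(len(spans) > 0)


clean_output: List[Span] = []
flagged_output: List[Span] = [(12, 27, "entity")]  # made-up example span

print(binary_label(clean_output), binary_label(flagged_output))
```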
Dataset Structure
Each example contains:
Required columns:
query: The prompt given to the model
context: Empty field (for…
See the full description on the dataset page: https://huggingface.co/datasets/wandb/fava-data-processed.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for Openvalidators dataset
Dataset Summary
The OpenValidators dataset, created by the OpenTensor Foundation, is a continuously growing collection of data generated by the OpenValidators project in W&B. It contains hundreds of thousands of records and serves researchers, data scientists, and miners in the Bittensor network. The dataset provides information on network performance, node behaviors, and wandb run details. Researchers can gain insights and detect… See the full description on the dataset page: https://huggingface.co/datasets/pedroferreira/openvalidators-test.
AFL-3.0: https://choosealicense.com/licenses/afl-3.0/
Dataset Card for "ARTeFACT"
ARTeFACT: Benchmarking Segmentation Models on Diverse Analogue Media Damage
Here we provide example code for downloading the data, loading it as a PyTorch dataset, splitting by material and/or content, and visualising examples.
Housekeeping
!pip install datasets
!pip install -qqqU wandb transformers pytorch-lightning==1.9.2 albumentations torchmetrics torchinfo
!pip install -qqq requests gradio
import os
from glob import glob
import cv2…
See the full description on the dataset page: https://huggingface.co/datasets/danielaivanova/damaged-media.
This dataset contains sequences of actions, motor angles, and pendulum angles, as well as velocities, for a rotary inverted pendulum robot. The dataset was collected while training the robot to swing up and balance (wandb run). Angles are in radians. Velocities were computed from the angles and fed to the policy. Control frequency is 75 Hz. The action maps to the motor voltage with:
deadzone = 0.1
center = 0.05
max_act = 0.9
if abs(action) > center:
    V = np.sign(action) * (…
See the full description on the dataset page: https://huggingface.co/datasets/armandpl/real_cartpole_200k.
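The description says velocities were computed from the angles but does not say how. A minimal sketch, assuming a backward finite difference at the stated 75 Hz control frequency and ignoring angle wrap-around; the function name is hypothetical:

```python
from typing import List

CONTROL_HZ = 75.0  # control frequency stated in the dataset description


def finite_difference_velocity(angles: List[float], hz: float = CONTROL_HZ) -> List[float]:
    """Approximate angular velocity (rad/s) from consecutive angle samples (rad).

    Assumptions: backward difference, the first sample gets velocity 0.0,
    and there is no wrap-around handling at +/- pi.
    """
    if not angles:
        return []
    velocities = [0.0]
    for prev, curr in zip(angles, angles[1:]):
        velocities.append((curr - prev) * hz)
    return velocities


velocities = finite_difference_velocity([0.0, 0.5, 1.5])
print(velocities)
```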
Dataset Card for AutoTrain Evaluator
This repository contains model predictions generated by AutoTrain for the following task and dataset:
Task: Token Classification
Model: lewtun/autotrain-acronym-identification-7324788
Dataset: acronym_identification
Config: default
Split: validation
To run new evaluation jobs, visit Hugging Face's automatic model evaluator.
Contributions
Thanks to @wandb.init(project=PROJECT for evaluating this model.