11 datasets found
  1. ami

    • huggingface.co
    • opendatalab.com
    Updated Oct 1, 2022
    + more versions
    Cite
    University of Edinburgh - Centre for Speech Technology Research (2022). ami [Dataset]. https://huggingface.co/datasets/edinburghcstr/ami
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Oct 1, 2022
    Dataset authored and provided by
    University of Edinburgh - Centre for Speech Technology Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The AMI Meeting Corpus consists of 100 hours of meeting recordings. The recordings use a range of signals synchronized to a common timeline. These include close-talking and far-field microphones, individual and room-view video cameras, and output from a slide projector and an electronic whiteboard. During the meetings, the participants also have unsynchronized pens available to them that record what is written. The meetings were recorded in English using three different rooms with different acoustic properties, and include mostly non-native speakers.
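
    For reference, a minimal sketch of loading this corpus with the Hugging Face datasets library; the "ihm" (individual headset microphone) configuration name and the field names mentioned in the comments are assumptions based on common AMI packaging, not details confirmed by this listing.

      # Sketch: load the AMI corpus from the Hugging Face Hub.
      # The "ihm" config name is an assumption; check the dataset page.
      from datasets import load_dataset

      ami = load_dataset("edinburghcstr/ami", "ihm")

      # Inspect one training example; field names such as "audio" and "text"
      # are typical for ASR datasets but should be verified on the card.
      sample = ami["train"][0]
      print(sample.keys())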

  2. ami

    • huggingface.co
    Updated May 2, 2024
    Cite
    diarizers-community (2024). ami [Dataset]. https://huggingface.co/datasets/diarizers-community/ami
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 2, 2024
    Dataset authored and provided by
    diarizers-community
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset Card for the AMI dataset for speaker diarization

    The AMI Meeting Corpus consists of 100 hours of meeting recordings. The recordings use a range of signals synchronized to a common timeline. These include close-talking and far-field microphones, individual and room-view video cameras, and output from a slide projector and an electronic whiteboard. During the meetings, the participants also have unsynchronized pens available to them that record what is written. The meetings… See the full description on the dataset page: https://huggingface.co/datasets/diarizers-community/ami.
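
    A hedged sketch of pulling this diarization packaging with the same datasets API; the "ihm" configuration and the speakers / timestamps_start field names are assumptions drawn from typical diarization dataset cards, not from this listing.

      # Sketch: load the speaker-diarization variant of AMI.
      # Config and field names below are assumptions; verify on the dataset page.
      from datasets import load_dataset

      ami_diar = load_dataset("diarizers-community/ami", "ihm")

      example = ami_diar["train"][0]
      # A diarization example is expected to pair audio with per-segment
      # speaker labels and start/end timestamps.
      print(example.get("speakers"), example.get("timestamps_start"))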

  3. AMIsum

    • huggingface.co
    Updated Jul 15, 2023
    Cite
    Laboratory of Language Technology at Tallinn University of Technology (2023). AMIsum [Dataset]. https://huggingface.co/datasets/TalTechNLP/AMIsum
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 15, 2023
    Dataset authored and provided by
    Laboratory of Language Technology at Tallinn University of Technology
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset Card for "AMIsum"

      Dataset Summary
    

    AMIsum is a meeting summarization dataset based on the AMI Meeting Corpus (https://groups.inf.ed.ac.uk/ami/corpus/). The dataset uses the meeting transcripts as the source data and abstractive summaries as the target data.

      Supported Tasks and Leaderboards
    

    More Information Needed

      Languages
    

    English

      Dataset Structure

      Data Instances

    {'transcript': '
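
    A short sketch of reading AMIsum for summarization experiments; only the 'transcript' field is shown in the data instance above, so the 'summary' field name and the "train" split name are assumptions.

      # Sketch: pair AMIsum transcripts with their abstractive summaries.
      # 'transcript' appears in the data instance above; 'summary' and the
      # "train" split name are assumptions.
      from datasets import load_dataset

      amisum = load_dataset("TalTechNLP/AMIsum")

      for record in amisum["train"].select(range(2)):
          source = record["transcript"]
          target = record.get("summary", "")  # assumed target field name
          print(len(source.split()), "source words ->", len(target.split()), "summary words")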

  4. ICSI Meeting Corpus Dataset

    • paperswithcode.com
    Cite
    Virgile Rennard; Guokan Shang; Julie Hunter; Michalis Vazirgiannis, ICSI Meeting Corpus Dataset [Dataset]. https://paperswithcode.com/dataset/icsi-meeting-corpus
    Explore at:
    Authors
    Virgile Rennard; Guokan Shang; Julie Hunter; Michalis Vazirgiannis
    Description

    ICSI Meeting Corpus in JSON format.

  5. ami-disfluency

    • huggingface.co
    Updated Apr 20, 2025
    Cite
    Phuoc Hoang Ho (2025). ami-disfluency [Dataset]. https://huggingface.co/datasets/hhoangphuoc/ami-disfluency
    Explore at:
    Dataset updated
    Apr 20, 2025
    Authors
    Phuoc Hoang Ho
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    DSFL Dataset - AMI Disfluency and Laughter Events

    This dataset contains segmented audio and video clips from the AMI Meeting Corpus, consisting only of disfluency and laughter events, segmented in both the audio and visual modalities. This dataset, along with hhoangphuoc/ami-av, was created for my research on audio-visual speech recognition, currently developed at https://github.com/hhoangphuoc/AVSL. For reproducing the work I've done to create this dataset, check out the… See the full description on the dataset page: https://huggingface.co/datasets/hhoangphuoc/ami-disfluency.

  6. Hydrographic and Impairment Statistics Database: AMIS

    • datasets.ai
    • catalog.data.gov
    Updated Sep 11, 2024
    + more versions
    Cite
    Department of the Interior (2024). Hydrographic and Impairment Statistics Database: AMIS [Dataset]. https://datasets.ai/datasets/hydrographic-and-impairment-statistics-database-amis-59e89
    Explore at:
    57
    Available download formats
    Dataset updated
    Sep 11, 2024
    Dataset authored and provided by
    Department of the Interior
    Description

    Hydrographic and Impairment Statistics (HIS) is a National Park Service (NPS) Water Resources Division (WRD) project established to track certain goals created in response to the Government Performance and Results Act of 1993 (GPRA). One water resources management goal established by the Department of the Interior under GPRA requires NPS to track the percent of its managed surface waters that are meeting Clean Water Act (CWA) water quality standards. This goal requires an accurate inventory that spatially quantifies the surface water hydrography that each bureau manages and a procedure to determine and track which waterbodies are or are not meeting water quality standards as outlined by Section 303(d) of the CWA. This project helps meet this DOI GPRA goal by inventorying and monitoring in a geographic information system for the NPS: (1) CWA 303(d) quality impaired waters and causes; and (2) hydrographic statistics based on the United States Geological Survey (USGS) National Hydrography Dataset (NHD). Hydrographic and 303(d) impairment statistics were evaluated based on a combination of 1:24,000 (NHD) and finer scale data (frequently provided by state GIS layers).

  7. ami-av

    • huggingface.co
    Updated Apr 3, 2025
    Cite
    Phuoc Hoang Ho (2025). ami-av [Dataset]. https://huggingface.co/datasets/hhoangphuoc/ami-av
    Explore at:
    Dataset updated
    Apr 3, 2025
    Authors
    Phuoc Hoang Ho
    Description

    Dataset Summary

    This is the processed Audio-Visual Dataset from the AMI Meeting Corpus. The dataset was segmented into sentence-level audio/video segments based on the individual [meeting_id]-[speaker_id] transcripts. The purpose of this data is the audio-visual speech recognition (AVSR) task, particularly for spontaneous conversational speech. General information about the dataset: total #segments: 83,438 (including audio, video, or both); Dataset({ features: ['id', 'meeting_id'… See the full description on the dataset page: https://huggingface.co/datasets/hhoangphuoc/ami-av.
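
    A hedged sketch of inspecting the segment-level structure described above; 'id' and 'meeting_id' are listed in the card, while the "train" split name and streaming access are assumptions.

      # Sketch: stream the audio-visual AMI segments and count them per meeting.
      # Streaming avoids downloading all ~83k segments; the split name is assumed.
      from collections import Counter
      from datasets import load_dataset

      ami_av = load_dataset("hhoangphuoc/ami-av", split="train", streaming=True)

      per_meeting = Counter()
      for segment in ami_av.take(1000):  # look at the first 1,000 segments only
          per_meeting[segment["meeting_id"]] += 1

      print(per_meeting.most_common(5))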

  8. ChannelSet: a composite dataset of diverse acoustic environments

    • zenodo.org
    Updated Jul 21, 2021
    Cite
    Benjamin Skerritt-Davis; Mattson Ogg (2021). ChannelSet: a composite dataset of diverse acoustic environments [Dataset]. http://doi.org/10.5281/zenodo.5117366
    Explore at:
    zip
    Available download formats
    Dataset updated
    Jul 21, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Benjamin Skerritt-Davis; Mattson Ogg
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We introduce ChannelSet, a dataset which provides a launchpad for exploring the extraneous acoustic information typically suppressed or ignored in audio tasks such as automatic speech recognition. We combined components of existing publicly available datasets to encompass broad variability in recording equipment, microphone position, room or surrounding acoustics, event density (i.e., how many audio events are present), and proportion of foreground and background sounds. Source datasets include: the CHiME-3 background dataset, CHiME-5 evaluation dataset, AMI meeting corpus, Freefield1010, and Vystadial2016.

    ChannelSet includes 13 classes spanning various acoustic environments: Indoor_Commercial_Bus, Indoor_Commercial_Cafe, Indoor_Domestic, Indoor_Meeting_Room1, Indoor_Meeting_Room2, Indoor_Meeting_Room3, Outdoor_City_Pedestrian, Outdoor_City_Traffic, Outdoor_Nature_Birds, Outdoor_Nature_Water, Outdoor_Nature_Weather, Telephony_CZ, and Telephony_EN. Each sample is between 1 and 10 seconds in duration. Each class contains 100 minutes of audio, for a total of 21.6 hours, split into separate test (20%) and train (80%) partitions.

    Download includes scripts, metadata, and instructions for producing ChannelSet from source datasets.
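
    The stated totals can be cross-checked with a few lines of arithmetic: 13 classes at 100 minutes each is 1,300 minutes, i.e. roughly the 21.6 hours quoted, and the 80/20 split leaves 80 minutes of training audio per class. A minimal sketch of that check:

      # Cross-check of the ChannelSet totals quoted above.
      n_classes = 13
      minutes_per_class = 100
      train_fraction = 0.8

      total_hours = n_classes * minutes_per_class / 60               # ~21.7 h
      train_minutes_per_class = minutes_per_class * train_fraction   # 80 min

      print(f"total audio: {total_hours:.1f} h")
      print(f"train audio per class: {train_minutes_per_class:.0f} min")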

  9. ami-dsfl-av

    • huggingface.co
    Updated Apr 20, 2025
    Cite
    Phuoc Hoang Ho (2025). ami-dsfl-av [Dataset]. https://huggingface.co/datasets/hhoangphuoc/ami-dsfl-av
    Explore at:
    Dataset updated
    Apr 20, 2025
    Authors
    Phuoc Hoang Ho
    Description

    AMI Disfluency and Laughter Dataset

    This dataset contains segmented audio and video clips extracted from the AMI Meeting Corpus. The segments are mainly disfluency and laughter events extracted from the original recordings. General information about this dataset:

    Number of recordings: 35,731
    Has audio: True
    Has video: True
    Has lip video: True

    Dataset({ features: ['id', 'meeting_id', 'speaker_id', 'start_time', 'end_time', 'duration'… See the full description on the dataset page: https://huggingface.co/datasets/hhoangphuoc/ami-dsfl-av.
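
    A hedged sketch of sanity-checking the 'duration' feature against 'start_time' and 'end_time' for a few segments; the feature names come from the card above, while the "train" split name and streaming access are assumptions.

      # Sketch: compare stored durations with end_time - start_time.
      # Feature names are from the card above; split name is assumed.
      from datasets import load_dataset

      dsfl = load_dataset("hhoangphuoc/ami-dsfl-av", split="train", streaming=True)

      for seg in dsfl.take(5):
          computed = seg["end_time"] - seg["start_time"]
          print(seg["id"], seg["speaker_id"], round(computed, 3), seg["duration"])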

  10. Average AMI scores (higher is better) of 100 independent runs of various...

    • figshare.com
    Updated Dec 5, 2024
    Cite
    Devavrat Vivek Dabke; Olga Dorabiala (2024). Average AMI scores (higher is better) of 100 independent runs of various community detection methods over a range of synthetic datasets. [Dataset]. http://doi.org/10.1371/journal.pcsy.0000023.t001
    Explore at:
    xls
    Available download formats
    Dataset updated
    Dec 5, 2024
    Dataset provided by
    PLOS Complex Systems
    Authors
    Devavrat Vivek Dabke; Olga Dorabiala
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    STGkM is our method, CC uses dynamic connected components, k-medoids compresses a dynamic graph into a single static one and uses k-medoids, and DCDID is a heuristic method [4]. The best performance is bolded.
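
    AMI in this table is the adjusted mutual information clustering score rather than the meeting corpus; for any pair of labelings it can be computed with scikit-learn as in the sketch below, where the two label vectors are made-up placeholders.

      # Sketch: adjusted mutual information (AMI) for a toy pair of
      # community assignments; the labelings are placeholders.
      from sklearn.metrics import adjusted_mutual_info_score

      true_communities = [0, 0, 1, 1, 2, 2]   # placeholder ground truth
      detected = [0, 0, 1, 2, 2, 2]           # placeholder method output

      print(adjusted_mutual_info_score(true_communities, detected))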

  11. Advanced Manufacturing Investment Strategy (AMIS) grant recipients -...

    • data.urbandatacentre.ca
    • beta.data.urbandatacentre.ca
    Updated Oct 1, 2024
    Cite
    (2024). Advanced Manufacturing Investment Strategy (AMIS) grant recipients - Catalogue - Canadian Urban Data Catalogue (CUDC) [Dataset]. https://data.urbandatacentre.ca/dataset/gov-canada-d7100328-6ecc-4bea-a66e-588b4f034c86
    Explore at:
    Dataset updated
    Oct 1, 2024
    Description

    The Advanced Manufacturing Investment Strategy focused on manufacturing companies that were investing in leading-edge technologies and processes to increase their productivity and competitiveness in Ontario. Projects must have had a minimum total project value of $10 million or create/retain 50 or more high-value jobs within 5 years. Ontario's Advanced Manufacturing Investment Strategy is no longer accepting applications, but has been very successful to date in meeting its objectives. This data set contains a list of recipients of the Advanced Manufacturing Investment Strategy from 2006 to 2012. The list includes the following details (a tabulation sketch follows the list):
    • funding program
    • name of company
    • location
    • fiscal year contract signed
    • government loan commitment
    • total project jobs created and retained as in the contract
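
    As a tabulation example, a hedged sketch of summarizing the recipient list with pandas; the file name and column labels are assumptions, since the listing only names the fields informally.

      # Sketch: total government loan commitments by fiscal year.
      # "amis_grant_recipients.csv" and the column names are hypothetical.
      import pandas as pd

      grants = pd.read_csv("amis_grant_recipients.csv")

      by_year = grants.groupby("fiscal_year_contract_signed")["government_loan_commitment"].sum()
      print(by_year)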

