69 datasets found
  1. Unravelling travellers’ route choice behaviour at full-scale urban network...

    • plos.figshare.com
    • figshare.com
    tiff
    Updated May 31, 2023
    Cite
    Humberto González Ramírez; Ludovic Leclercq; Nicolas Chiabaut; Cécile Becarie; Jean Krug (2023). Unravelling travellers’ route choice behaviour at full-scale urban network by focusing on representative OD pairs in computer experiments [Dataset]. http://doi.org/10.1371/journal.pone.0225069
    Explore at:
    Available download formats: tiff
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Humberto González Ramírez; Ludovic Leclercq; Nicolas Chiabaut; Cécile Becarie; Jean Krug
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In a city-scale network, trips are made in thousands of origin-destination (OD) pairs connected by multiple routes, resulting in a large number of alternatives with diverse characteristics that influence the route choice behaviour of the travellers. As a consequence, to accurately predict user choices at full network scale, a route choice model should be scalable to suit all possible configurations that may be encountered. In this article, a new methodology to obtain such a model is proposed. The main idea is to use clustering analysis to obtain a small set of representative OD pairs and routes that can be investigated in detail through computer route choice experiments to collect observations on travellers’ behaviour. The results are then scaled up to all other OD pairs in the network. It was found that 9 OD pair configurations are sufficient to represent the network of Lyon, France, composed of 96,096 OD pairs and 559,423 routes. The observations, collected over these nine representative OD pair configurations, were used to estimate three mixed logit models. The predictive accuracy of the three models was tested against the predictive accuracy of the same models (with the same specification), but estimated over randomly selected OD pair configurations. The obtained results show that the models estimated with the representative OD pairs are superior in predictive accuracy, which supports scaling up the participants’ choices over the representative OD pair configurations to the entire network and validates the methodology in this study.

  2. ODOT District OD Pairs

    • ohiostatefreightplan-ohiodot.hub.arcgis.com
    Updated Apr 22, 2022
    Cite
    Ohio Department of Transportation (2022). ODOT District OD Pairs [Dataset]. https://ohiostatefreightplan-ohiodot.hub.arcgis.com/datasets/odot-district-od-pairs/about
    Explore at:
    Dataset updated
    Apr 22, 2022
    Dataset authored and provided by
    Ohio Department of Transportation (https://transportation.ohio.gov/)
    Description

    Table containing the data used to join ODOT Districts to the origin and destination of freight. The table also includes origin and destination ODOT District codes and indicates whether the origin or destination is international or a US state.

  3. Dataset defining representative route network for GLOWOPT market segments

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 18, 2024
    Cite
    Radhakrishnan, Kaushik (2024). Dataset defining representative route network for GLOWOPT market segments [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5110097
    Explore at:
    Dataset updated
    Jul 18, 2024
    Dataset authored and provided by
    Radhakrishnan, Kaushik
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    To calculate the GLOWOPT representative route network, a forecast model chain was used. The model was calibrated with 2019 flight movement data (unimpeded by COVID-19) and provided forecasted aircraft movements from the year 2019 (~2020) to 2050 in 5-year intervals.

    The results of the forecast model chain are provided in two formats: CSV files and a 4-dimensional array in MATLAB format (.mat).

    CSV Datasets

    For each forecasted year, a CSV file is generated containing the IATA codes of the origin-destination (OD) airports, the region, latitude and longitude of the OD pair, the representative aircraft type along with the aircraft category, the average load factor and, finally, the distance between the OD pair. Airports worldwide are subdivided into nine regions: Africa, Asia, Caribbean, Central America, Europe, Middle East, North America, Oceania and South America. There are seven datasets in total, one for each forecasted year: 2019 (~2020), 2025, 2030, 2035, 2040, 2045 and 2050.

    Description of the data labels:

    Origin- Origin airport IATA code

    Origin_Region- Region of the Origin Airport

    Origin_Latitude- Latitude of the Origin Airport

    Origin_Longitude- Longitude of the Origin Airport

    Destination- Destination airport IATA code

    Destination_Region- Region of the Destination Airport

    Destination_Latitude- Latitude of the Destination Airport

    Destination_Longitude- Longitude of the Destination Airport

    AcType- Representative aircraft type

    Load_Factor- Average load factor per flight

    Yearly_Frequency- Total aircraft movements per annum

    RefACType- Aircraft Category based on number of seats (Category 6 represents aircraft with seats 252-301 and category 7 represents aircraft with seats greater than 302.)

    Distance- Great circle distance between Origin and Destination in km.
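
    The Distance label is described as a great circle distance in km. As a rough cross-check, a haversine calculation on the four latitude/longitude columns should land close to the stored value. The sketch below is an assumption-laden illustration, not the dataset's own method: it assumes a spherical Earth with a mean radius of 6371 km, and the function name `haversine_km` is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # ASSUMPTION: mean Earth radius; the dataset does not state its model


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# A quarter of the equator, used here only as a self-checking example.
quarter_equator = haversine_km(0.0, 0.0, 0.0, 90.0)
print(round(quarter_equator, 1))
```

    In practice the four arguments would come from Origin_Latitude, Origin_Longitude, Destination_Latitude and Destination_Longitude of one CSV row, and small differences from the Distance column are to be expected if the authors used a different Earth model.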

    MATLAB Datasets

    The MATLAB dataset is a 4-dimensional array stored in a *.mat file. The first dimension is the region of the origin airport and the second dimension is the region of the destination airport. The third and fourth dimensions are the aircraft category based on seat numbers and the categorised great circle distance, respectively. Each element is a 1x1 cell with the IATA codes of the OD pairs, the frequency and the great circle distance in km.

    The 4D array is categorised such that the user can select the route segment specific to a region or a combination of regions. The range categorisation in combination with an aircraft category additionally offers the user the possibility to select routes depending on their great circle distances. The ranges are categorised to represent very short range (0-2000 km), short range (2000-6000 km), medium range (6000-10000 km) and long range (10000 – 15000 km).

    Indexing based on the categorisation of the 4D array dataset - Refer to file 'Indexing_MAT_Dataset.PNG'

    For example:

    To derive the OD pairs and yearly frequency of aircraft movements for routes which originate from Europe and are destined to Asia, operated with category 6 aircraft type and are separated by distances between 10,000 to 15,000 km:

    In MATLAB (Indexing based on file 'Indexing_MAT_Dataset.PNG' ):

    Route_Network(5,2,1,4)

    Description on Index:

    5 – Europe: Origin Region

    2 – Asia: Destination Region

    1 – Category 6: Aircraft Type

    4 – 10000-15000 km: Range
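
    The index lookup in the example above can be sketched as a small helper that turns names into the 1-based index tuple used in MATLAB. This is a hedged illustration only: the single documented example confirms Europe = 5, Asia = 2, category 6 = 1 and 10000-15000 km = 4, and every other position below is an assumption inferred from the order in which regions and ranges are listed. The authoritative mapping is the file 'Indexing_MAT_Dataset.PNG'. The helper names and the half-open treatment of range boundaries are hypothetical.

```python
# Region order as listed in the description (alphabetical); ASSUMED to be the
# index order. The documented example (Europe = 5, Asia = 2) agrees with it.
REGIONS = ["Africa", "Asia", "Caribbean", "Central America", "Europe",
           "Middle East", "North America", "Oceania", "South America"]

# Index 1 = category 6 is documented; index 2 = category 7 is ASSUMED.
CATEGORIES = [6, 7]

# Index 4 = 10000-15000 km is documented; the other positions are ASSUMED
# to follow the order in which the ranges are listed.
RANGE_BINS_KM = [(0, 2000), (2000, 6000), (6000, 10000), (10000, 15000)]


def range_index(distance_km: float) -> int:
    """Return the 1-based range index, treating bins as half-open [low, high)."""
    for i, (low, high) in enumerate(RANGE_BINS_KM, start=1):
        if low <= distance_km < high:
            return i
    raise ValueError(f"distance {distance_km} km is outside the documented ranges")


def route_network_index(origin: str, destination: str, category: int, distance_km: float):
    """Build the 1-based (origin, destination, category, range) index tuple."""
    return (REGIONS.index(origin) + 1,
            REGIONS.index(destination) + 1,
            CATEGORIES.index(category) + 1,
            range_index(distance_km))


print(route_network_index("Europe", "Asia", 6, 12000))  # (5, 2, 1, 4)
```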

  4. MBTA Rapid Transit Travel Times 2021

    • gis.data.mass.gov
    • arc-gis-hub-home-arcgishub.hub.arcgis.com
    • +3more
    Updated May 3, 2021
    Cite
    Massachusetts geoDOT (2021). MBTA Rapid Transit Travel Times 2021 [Dataset]. https://gis.data.mass.gov/datasets/4892695c4baf42419cbe19bd4995bc3a
    Explore at:
    Dataset updated
    May 3, 2021
    Dataset authored and provided by
    Massachusetts geoDOT
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    It includes all heavy rail and light rail rapid transit lines. These travel times are calculated from the departure time at the origin stop to the arrival time at the destination stop. Due to track circuit or other data issues, data is not guaranteed to be complete for any origin-destination pair or date.

    Data Dictionary:

    Name | Description | Data Type | Example
    service_date | Date for which travel times should be returned. | Date | 43830
    route_id | GTFS-compatible route for which travel times should be returned. | String | Orange
    direction_id | GTFS-compatible direction for which travel times should be returned. | Integer | 0
    from_stop_id | GTFS-compatible stop representing the origin stop in a pair. | String | 70205
    to_stop_id | GTFS-compatible stop representing the destination stop in a pair. | String | 70154
    start_time_sec | Property of “Travel Times”. Expressed in "seconds after midnight." The time associated with the departure event of the vehicle from the origin stop of the pair. | Integer | 45763
    end_time_sec | Property of “Travel Times”. Expressed in "seconds after midnight." The time associated with the arrival event of the vehicle to the destination stop of the pair. | Integer | 46411
    travel_time_sec | Property of “Travel Times”. Difference between start_time_sec and end_time_sec. The actual travel time between the origin stop and the destination stop, in seconds. | Integer | 648

    MassDOT/MBTA shall not be held liable for any errors in this data. This includes errors of omission, commission, errors concerning the content of the data, and relative and positional accuracy of the data. This data cannot be construed to be a legal document. Primary sources from which this data was compiled must be consulted for verification of information contained in this data.
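
    The relationship between the three time fields can be checked with the example values from the data dictionary. The sketch below is illustrative only: the helper names `to_clock` and `travel_time_sec` are hypothetical, and the row is assembled from the dictionary's example column rather than read from the actual file.

```python
def to_clock(seconds_after_midnight: int) -> str:
    """Format a "seconds after midnight" value as HH:MM:SS."""
    hours, remainder = divmod(seconds_after_midnight, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def travel_time_sec(start_time_sec: int, end_time_sec: int) -> int:
    """Travel time is the arrival time minus the departure time, in seconds."""
    return end_time_sec - start_time_sec


# Example values taken from the data dictionary above.
row = {"start_time_sec": 45763, "end_time_sec": 46411, "travel_time_sec": 648}

computed = travel_time_sec(row["start_time_sec"], row["end_time_sec"])
print(to_clock(row["start_time_sec"]), to_clock(row["end_time_sec"]), computed)
# 12:42:43 12:53:31 648
```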

  5. MXL models estimation results.

    • plos.figshare.com
    xls
    Updated Jun 9, 2023
    Cite
    Humberto González Ramírez; Ludovic Leclercq; Nicolas Chiabaut; Cécile Becarie; Jean Krug (2023). MXL models estimation results. [Dataset]. http://doi.org/10.1371/journal.pone.0225069.t003
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 9, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Humberto González Ramírez; Ludovic Leclercq; Nicolas Chiabaut; Cécile Becarie; Jean Krug
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MXL models estimation results.

  6. Image-audio pairs (3 of 3)

    • kaggle.com
    zip
    Updated Jun 28, 2024
    Cite
    Jorvan (2024). Image-audio pairs (3 of 3) [Dataset]. https://www.kaggle.com/datasets/jorvan/image-audio-pairs-3-of-3
    Explore at:
    Available download formats: zip (0 bytes)
    Dataset updated
    Jun 28, 2024
    Authors
    Jorvan
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This is the third of three datasets that we created for audio-image training tasks. They collect pairs of images and audio clips extracted from the same time intervals in videos from the public datasets MUSIC [1, 2], AudioSetZSL [3, 4] and SoundNet [5, 6]. As AudioSetZSL is intended only for research purposes, so are our datasets. The images are 512x512 .png files in RGB24 colour format. The audio clips are 1 s long, monophonic .wav files sampled at 16 kHz with 16-bit depth. Because of a small slip, the first .zip file was not named 1.zip. This has no meaningful impact: the data is still perfectly usable and quite varied. In addition, the pairs are easy to distinguish by their numeric names, which were assigned in ascending order.
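
    The stated audio format (1 s, mono, 16 kHz, 16-bit .wav) can be verified per file with Python's standard `wave` module. The sketch below is only an illustration: `describe_wav` and `EXPECTED` are hypothetical names, and the clip is a synthetic second of silence written to memory, since the actual file names inside the .zip archives are not given here.

```python
import io
import wave

# Expected properties, taken from the dataset description.
EXPECTED = {"channels": 1, "sample_width_bytes": 2, "frame_rate_hz": 16000, "duration_s": 1.0}


def describe_wav(source) -> dict:
    """Read the header of a .wav file (a path or a binary file-like object)."""
    with wave.open(source, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "frame_rate_hz": wav.getframerate(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }


# Synthetic stand-in for one clip: 1 s of silence, mono, 16 kHz, 16-bit.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(16000)
    out.writeframes(b"\x00\x00" * 16000)
buffer.seek(0)

info = describe_wav(buffer)
print(info == EXPECTED)  # True
```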

    [1] Hang Zhao and Andrew Rouditchenko. MUSIC Dataset from Sound of Pixels. 2018. url: https://github.com/roudimit/MUSIC_dataset.
    [2] Hang Zhao et al. “The Sound of Pixels”. In: Proceedings of the 15th European Conference on Computer Vision. 2018, pages 587-604.
    [3] Kranti Kumar Parida. AudioSetZSL. 2019. url: https://github.com/krantiparida/AudioSetZSL.
    [4] Kranti Kumar Parida et al. “Coordinated Joint Multimodal Embeddings for Generalized Audio-Visual Zero-shot Classification and Retrieval of Videos”. In: Proceedings of the 2020 IEEE Winter Conference on Applications of Computer Vision. 2020, pages 3240-3249.
    [5] Yusuf Aytar, Carl Vondrick and Antonio Torralba. “SoundNet: Learning Sound Representations from Unlabeled Video”. In: Proceedings of the 30th International Conference on Neural Information Processing Systems. 2016, pages 892-900.
    [6] Bart Thomee et al. “YFCC100M: the new data in multimedia research”. In: Communications of the ACM 59.2 (2016), pages 64-73.

  7. Air passenger origin and destination, transborder portions of international...

    • www150.statcan.gc.ca
    • beta.data.urbandatacentre.ca
    • +2more
    Updated Jan 17, 2020
    + more versions
    Cite
    Government of Canada, Statistics Canada (2020). Air passenger origin and destination, transborder portions of international journeys, traffic volumes ranked by city-pair, exceeding 400 outbound plus inbound passengers, annual [Dataset]. http://doi.org/10.25318/2310025501-eng
    Explore at:
    Dataset updated
    Jan 17, 2020
    Dataset provided by
    Statistics Canada (https://statcan.gc.ca/en)
    Area covered
    Canada
    Description

    Air passenger origin and destination data (passenger numbers, city rank), for transborder portions of international journeys, by total outbound and inbound passengers exceeding 400, by city-pair, annual.

  8. MTA Subway Origin-Destination Ridership Estimate: Beginning 2025

    • catalog.data.gov
    Updated Jun 21, 2025
    + more versions
    Cite
    data.ny.gov (2025). MTA Subway Origin-Destination Ridership Estimate: Beginning 2025 [Dataset]. https://catalog.data.gov/dataset/mta-subway-origin-destination-ridership-estimate-beginning-2025
    Explore at:
    Dataset updated
    Jun 21, 2025
    Dataset provided by
    data.ny.gov
    Description

    This dataset provides an estimate of subway travel patterns based on scaled-up OMNY and MetroCard return swipe data. It provides estimated passenger volumes for all populated origin-destination (O-D) pairs aggregated by month, day of the week, and hour of day. It also provides the name, ID, and approximate latitude and longitude of the origin and destination subway complexes.

  9. Hyperparameters of the prior distribution h.

    • plos.figshare.com
    xls
    Updated May 31, 2023
    Cite
    Humberto González Ramírez; Ludovic Leclercq; Nicolas Chiabaut; Cécile Becarie; Jean Krug (2023). Hyperparameters of the prior distribution h. [Dataset]. http://doi.org/10.1371/journal.pone.0225069.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Humberto González Ramírez; Ludovic Leclercq; Nicolas Chiabaut; Cécile Becarie; Jean Krug
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Hyperparameters of the prior distribution h.

  10. PP-ind: A Repository of Industrial Pair Programming Research Data

    • zenodo.org
    txt
    Updated Feb 15, 2021
    + more versions
    Cite
    Franz Zieris; Franz Zieris; Lutz Prechelt; Lutz Prechelt (2021). PP-ind: A Repository of Industrial Pair Programming Research Data [Dataset]. http://doi.org/10.5281/zenodo.4529143
    Explore at:
    Available download formats: txt
    Dataset updated
    Feb 15, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Franz Zieris; Franz Zieris; Lutz Prechelt; Lutz Prechelt
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    PP-ind is a repository of research data on industrial pair programming sessions. Since 2007, our research group has collected audio-video-recordings and questionnaire data in 13 companies. A total of 57 developers worked together (mostly in groups of two, but also three or four) in 67 sessions with a mean length of 1:35 hours. A separate tech report provides many details on how this data was collected.

    While we cannot share the original video recordings due to confidentiality agreements, we do provide transcripts of the pairs' dialog in this data set. Note that we transcribe our data on an as-needed basis. Early versions of this data set will therefore contain only a few partial transcripts, which will be amended over time.

    Files named "session--transcript.txt" contain original quotations in the language spoken by the recorded developers. For non-English sessions, we also provide non-authoritative "session--transcript_translated.txt" files (following the same as-needed rule for translating the originals). All our analyses, however, are performed on the raw data as reflected in the original transcripts. See file "transcription-notation.txt" for details on the special notation we use.

  11. Quarterly Refrigerated Truck Rates by Origin-Destination Pair

    • agtransport.usda.gov
    application/rdfxml +5
    Updated Jun 26, 2025
    Cite
    AMS Specialty Crops (2025). Quarterly Refrigerated Truck Rates by Origin-Destination Pair [Dataset]. https://agtransport.usda.gov/w/qm5q-5r5f/default?cur=LnrwPy1lwQF
    Explore at:
    Available download formats: json, application/rssxml, application/rdfxml, csv, xml, tsv
    Dataset updated
    Jun 26, 2025
    Dataset authored and provided by
    AMS Specialty Crops
    Description

    Quarterly average refrigerated truck rates ($/mile, $/truckload) from origin shipping areas to destination receiving city.

  12. pair-preference-dataset-700K_subset-7-out-of-8_gemma-2b-it_1-of-8

    • huggingface.co
    Updated Sep 13, 2024
    + more versions
    Cite
    reward modeling (2024). pair-preference-dataset-700K_subset-7-out-of-8_gemma-2b-it_1-of-8 [Dataset]. https://huggingface.co/datasets/cornfieldrm/pair-preference-dataset-700K_subset-7-out-of-8_gemma-2b-it_1-of-8
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Sep 13, 2024
    Dataset authored and provided by
    reward modeling
    Description

    cornfieldrm/pair-preference-dataset-700K_subset-7-out-of-8_gemma-2b-it_1-of-8 dataset hosted on Hugging Face and contributed by the HF Datasets community

  13. Maltese crowS-pairs dataset

    • drum.um.edu.mt
    application/csv
    Updated Jun 25, 2024
    Cite
    CLAUDIA BORG; Marthese Borg (2024). Maltese crowS-pairs dataset [Dataset]. http://doi.org/10.60809/drum.26056957.v1
    Explore at:
    Available download formats: application/csv
    Dataset updated
    Jun 25, 2024
    Dataset provided by
    University of Malta
    Authors
    CLAUDIA BORG; Marthese Borg
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Warning: This dataset contains explicit statements of offensive stereotypes which may be upsetting.

    The study of bias, fairness and social impact in Natural Language Processing (NLP) lacks resources in languages other than English. Our objective is to support the evaluation of bias in language models in a multilingual setting. We use stereotypes across nine types of biases to build a corpus containing contrasting sentence pairs, one sentence that presents a stereotype concerning an underadvantaged group and another minimally changed sentence, concerning a matching advantaged group.

    In total, we produced 11,139 new sentence pairs that cover stereotypes dealing with nine types of biases in seven cultural contexts. We use the final resource for the evaluation of relevant monolingual and multilingual masked language models.

    This file contains the sentence pairs localised to the Maltese context in the Maltese language.

    Other languages are available here: https://gitlab.inria.fr/corpus4ethics/multilingualcrowspairs

    The paper describing this work is available here: https://www.um.edu.mt/library/oar/handle/123456789/121722 and https://aclanthology.org/2024.lrec-main.1545/

    To use this dataset, please use the following citation:

    Karen Fort, Laura Alonso Alemany, Luciana Benotti, Julien Bezançon, Claudia Borg, Marthese Borg, Yongjian Chen, Fanny Ducel, Yoann Dupont, Guido Ivetta, Zhijian Li, Margot Mieskes, Marco Naguib, Yuyan Qian, Matteo Radaelli, Wolfgang S. Schmeisser-Nieto, Emma Raimundo Schulz, Thiziri Saci, Sarah Saidi, et al. 2024. Your Stereotypical Mileage May Vary: Practical Challenges of Evaluating Biases in Multiple Languages and Cultural Contexts. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 17764–17769, Torino, Italia. ELRA and ICCL.

  14. Digital image analysis tools for pairs of filaments in embedded 3D printing

    • catalog.data.gov
    • data.nist.gov
    Updated Mar 14, 2025
    + more versions
    Cite
    National Institute of Standards and Technology (2025). Digital image analysis tools for pairs of filaments in embedded 3D printing [Dataset]. https://catalog.data.gov/dataset/digital-image-analysis-tools-for-pairs-of-filaments-in-embedded-3d-printing
    Explore at:
    Dataset updated
    Mar 14, 2025
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    In embedded 3D printing, a nozzle is embedded into a support bath and extrudes filaments or droplets into the bath. This repository includes Python code for analyzing and managing images and videos of the printing process during extrusion of single filaments, pairs of filaments, and triplets of filaments. The link to the GitHub release goes to the state of the code when the paper was submitted. From there, you can also access the current state of the code.

  15. 23 Pairs of Identical Twins Face Image Data

    • m.nexdata.ai
    Updated Sep 18, 2023
    Cite
    Nexdata (2023). 23 Pairs of Identical Twins Face Image Data [Dataset]. https://m.nexdata.ai/datasets/computervision/1007?source=Github
    Explore at:
    Dataset updated
    Sep 18, 2023
    Dataset authored and provided by
    Nexdata
    Variables measured
    Device, Accuracy, Data size, Data format, Data diversity, Race distribution, Collection environment
    Description

    23 Pairs of Identical Twins Face Image Data. The collection scenes include indoor and outdoor scenes. The data diversity includes multiple face angles, multiple face postures, close-ups of eyes, multiple light conditions and multiple age groups. This dataset can be used for tasks such as twins' face recognition.

  16. Sentence/Table Pair Data from Wikipedia for Pre-training with...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Oct 29, 2021
    Cite
    Cong Yu (2021). Sentence/Table Pair Data from Wikipedia for Pre-training with Distant-Supervision [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5612315
    Explore at:
    Dataset updated
    Oct 29, 2021
    Dataset provided by
    Xiang Deng
    Huan Sun
    Yu Su
    Cong Yu
    Alyssa Lees
    You Wu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is the dataset used for pre-training in "ReasonBERT: Pre-trained to Reason with Distant Supervision", EMNLP'21.

    There are two files:

    sentence_pairs_for_pretrain_no_tokenization.tar.gz -> Contain only sentences as evidence, Text-only

    table_pairs_for_pretrain_no_tokenization.tar.gz -> At least one piece of evidence is a table, Hybrid

    The data is chunked into multiple tar files for easy loading. We use WebDataset, a PyTorch Dataset (IterableDataset) implementation providing efficient sequential/streaming data access.

    For pre-training code, or if you have any questions, please check our GitHub repo https://github.com/sunlab-osu/ReasonBERT

    Below is a sample code snippet to load the data

    import webdataset as wds

    # path to the uncompressed files, should be a directory with a set of tar files
    url = './sentence_multi_pairs_for_pretrain_no_tokenization/{000000...000763}.tar'
    dataset = (
        wds.Dataset(url)
        .shuffle(1000)  # cache 1000 samples and shuffle
        .decode()
        .to_tuple("json")
        .batched(20)  # group every 20 examples into a batch
    )

    Please see the WebDataset documentation for more details on how to use it as a dataloader for PyTorch.

    You can also iterate through all examples and dump them in your preferred data format.

    Below we show how the data is organized with two examples.

    Text-only

    {
      's1_text': 'Sils is a municipality in the comarca of Selva, in Catalonia, Spain.',  # query sentence
      's1_all_links': {
        'Sils,_Girona': [[0, 4]],
        'municipality': [[10, 22]],
        'Comarques_of_Catalonia': [[30, 37]],
        'Selva': [[41, 46]],
        'Catalonia': [[51, 60]]
      },  # list of entities and their mentions in the sentence (start, end location)
      'pairs': [  # other sentences that share common entity pair with the query, group by shared entity pairs
        {
          'pair': ['Comarques_of_Catalonia', 'Selva'],  # the common entity pair
          's1_pair_locs': [[[30, 37]], [[41, 46]]],  # mention of the entity pair in the query
          's2s': [  # list of other sentences that contain the common entity pair, or evidence
            {
              'md5': '2777e32bddd6ec414f0bc7a0b7fea331',
              'text': 'Selva is a coastal comarque (county) in Catalonia, Spain, located between the mountain range known as the Serralada Transversal or Puigsacalm and the Costa Brava (part of the Mediterranean coast). Unusually, it is divided between the provinces of Girona and Barcelona, with Fogars de la Selva being part of Barcelona province and all other municipalities falling inside Girona province. Also unusually, its capital, Santa Coloma de Farners, is no longer among its larger municipalities, with the coastal towns of Blanes and Lloret de Mar having far surpassed it in size.',
              's_loc': [0, 27],  # in addition to the sentence containing the common entity pair, we also keep its surrounding context. 's_loc' is the start/end location of the actual evidence sentence
              'pair_locs': [  # mentions of the entity pair in the evidence
                [[19, 27]],  # mentions of entity 1
                [[0, 5], [288, 293]]  # mentions of entity 2
              ],
              'all_links': {
                'Selva': [[0, 5], [288, 293]],
                'Comarques_of_Catalonia': [[19, 27]],
                'Catalonia': [[40, 49]]
              }
            },
            ...
          ]  # there are multiple evidence sentences
        },
        ...
      ]  # there are multiple entity pairs in the query
    }
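
    The (start, end) locations in the text-only example can be resolved back to the mention strings with plain slicing. The sketch below assumes the offsets are half-open character positions, [start, end), which is inferred from the sample values rather than stated in the description; `mention_texts` is a hypothetical helper name.

```python
# Query sentence and entity links copied from the text-only example above.
s1_text = "Sils is a municipality in the comarca of Selva, in Catalonia, Spain."
s1_all_links = {
    "Sils,_Girona": [[0, 4]],
    "municipality": [[10, 22]],
    "Comarques_of_Catalonia": [[30, 37]],
    "Selva": [[41, 46]],
    "Catalonia": [[51, 60]],
}


def mention_texts(text: str, links: dict) -> dict:
    """Map each linked entity to the surface strings of its mentions.

    ASSUMPTION: offsets are half-open [start, end) character positions.
    """
    return {
        entity: [text[start:end] for start, end in spans]
        for entity, spans in links.items()
    }


for entity, mentions in mention_texts(s1_text, s1_all_links).items():
    print(f"{entity}: {mentions}")
```

    Under that assumption the entity 'Comarques_of_Catalonia' resolves to the surface form 'comarca', which shows that entity keys are link targets and need not match the mention text.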

    Hybrid

    {
      's1_text': 'The 2006 Major League Baseball All-Star Game was the 77th playing of the midseason exhibition baseball game between the all-stars of the American League (AL) and National League (NL), the two leagues comprising Major League Baseball.',
      's1_all_links': {...},  # same as text-only
      'sentence_pairs': [{'pair': ..., 's1_pair_locs': ..., 's2s': [...]}],  # same as text-only
      'table_pairs': [
        {
          'tid': 'Major_League_Baseball-1',
          'text': [
            ['World Series Records', 'World Series Records', ...],
            ['Team', 'Number of Series won', ...],
            ['St. Louis Cardinals (NL)', '11', ...],
            ...
          ],  # table content, list of rows
          'index': [
            [[0, 0], [0, 1], ...],
            [[1, 0], [1, 1], ...],
            ...
          ],  # index of each cell [row_id, col_id]. we keep only a table snippet, but the index here is from the original table.
          'value_ranks': [
            [0, 0, ...],
            [0, 0, ...],
            [0, 10, ...],
            ...
          ],  # if the cell contains a numeric value/date, this is its rank ordered from small to large, following TAPAS
          'value_inv_ranks': [],  # inverse rank
          'all_links': {
            'St._Louis_Cardinals': {
              '2': [
                [[2, 0], [0, 19]],  # [[row_id, col_id], [start, end]]
              ]  # list of mentions in the second row, the key is row_id
            },
            'CARDINAL:11': {'2': [[[2, 1], [0, 2]]], '8': [[[8, 3], [0, 2]]]},
          },
          'name': '',  # table name, if exists
          'pairs': {
            'pair': ['American_League', 'National_League'],
            's1_pair_locs': [[[137, 152]], [[162, 177]]],  # mention in the query
            'table_pair_locs': {
              '17': [  # mention of entity pair in row 17
                [
                  [[17, 0], [3, 18]],
                  [[17, 1], [3, 18]],
                  [[17, 2], [3, 18]],
                  [[17, 3], [3, 18]]
                ],  # mention of the first entity
                [
                  [[17, 0], [21, 36]],
                  [[17, 1], [21, 36]],
                ]  # mention of the second entity
              ]
            }
          }
        }
      ]
    }

  17. NIST Fingerprint Image Registration Library (NFRL). Registers a pair of...

    • catalog.data.gov
    • data.nist.gov
    Updated Feb 23, 2023
    Cite
    National Institute of Standards and Technology (2023). NIST Fingerprint Image Registration Library (NFRL). Registers a pair of fingerprint images using two pairs of control-points (pixel locations). Registration is rigid; translation and rotation are performed without scaling. [Dataset]. https://catalog.data.gov/dataset/nist-fingerprint-image-registration-library-nfrl-registers-a-pair-of-fingerprint-images-us
    Explore at:
    Dataset updated
    Feb 23, 2023
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    NFRL registers two fingerprint images based on a pair of corresponding control-points. It uses this control-point pair of pixel locations within the images to translate and rotate the Moving image to the Fixed image.

    Runtime configuration parameters include:
    • moving and fixed image data in 8-bits-per-pixel grayscale (preferable but not required)
    • two pairs of corresponding control points (pixel coordinates)

    The fingerprint-image rigid-registration process is performed in two steps:
    1. Translation of the Moving image to the Fixed image using the "first" pair of control-points (the unconstrained pair)
    2. Rotation of the Moving image around the Fixed image control-point (the translation "target" location) based on the angle-difference determined by the "second" pair of control-points (the constrained pair)

    Both final images, a few interim images, and registration metadata generated during the registration process are made available to the using software:
    • Final registered Moving image
    • Final registered Fixed image
    • Registered, padded, overlaid image (colorized)
    • Registered, padded, Moving image (grayscale)
    • Padded, Fixed image (grayscale)
    • Summed, registered, dilated overlaid image (the "blob")
    • Process metadata available in both text and XML format

    The two Final images are registered. They are cropped to the region-of-interest that is the smallest area of "overlap" per the registration. Therefore, these two images have identical width and height, which enables analysis using metrics like PSNR (Peak Signal to Noise Ratio).
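
    The two-step process described above (translate on the first control-point pair, then rotate about the Fixed control-point by the angle difference given by the second pair) can be sketched on point coordinates alone. This is not NFRL's API: `rigid_transform` and its arguments are hypothetical, the sketch does no image resampling, padding or cropping, and it simply follows the geometry as described, with no scaling.

```python
import math


def rigid_transform(moving_cp1, moving_cp2, fixed_cp1, fixed_cp2):
    """Return (apply, theta_degrees), where apply maps a Moving-image point
    (x, y) into the Fixed-image frame using translation plus rotation only."""
    # Step 1: translation that puts the first Moving control point on the first Fixed one.
    tx = fixed_cp1[0] - moving_cp1[0]
    ty = fixed_cp1[1] - moving_cp1[1]

    # Step 2: rotation angle = difference between the directions from the
    # first to the second control point in each image.
    angle_fixed = math.atan2(fixed_cp2[1] - fixed_cp1[1], fixed_cp2[0] - fixed_cp1[0])
    angle_moving = math.atan2(moving_cp2[1] - moving_cp1[1], moving_cp2[0] - moving_cp1[0])
    theta = angle_fixed - angle_moving
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    def apply(point):
        # Translate, then rotate about the first Fixed control point.
        x = point[0] + tx - fixed_cp1[0]
        y = point[1] + ty - fixed_cp1[1]
        return (fixed_cp1[0] + cos_t * x - sin_t * y,
                fixed_cp1[1] + sin_t * x + cos_t * y)

    return apply, math.degrees(theta)


# Made-up control points, chosen so both pairs are the same distance apart.
apply, theta_deg = rigid_transform((10, 10), (20, 10), (50, 40), (50, 50))
print(round(theta_deg, 1), apply((10, 10)), apply((20, 10)))
```

    Because the transform is rigid, the second Moving control point lands exactly on the second Fixed control point only when both pairs are the same distance apart; otherwise it lands on the line through the two Fixed control points.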

  18. Number of pair of shoes exported from Italy 2019, by country of destination

    • ai-chatbox.pro
    • statista.com
    Updated Sep 25, 2024
    Statista (2024). Number of pair of shoes exported from Italy 2019, by country of destination [Dataset]. https://www.ai-chatbox.pro/?_=%2Fstatistics%2F601676%2Fexport-volume-of-italian-shoe-industry-by-country-of-destination%2F%23XgboD02vawLKoDs%2BT%2BQLIV8B6B4Q9itA
    Dataset updated
    Sep 25, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 2019 - Oct 2019
    Area covered
    Italy
    Description

    This statistic shows the number of pairs of shoes exported by the Italian shoe industry in the first ten months of 2019, by destination country. During this period, roughly 30.9 million pairs were exported to France. Germany was the second main destination market in terms of volume, followed by Switzerland.

  19. Paired Associates Learning: Memory for Word Pairs in Cued Recall

    • openneuro.org
    Updated Apr 4, 2024
    Haydn G. Herrema; Michael J. Kahana (2024). Paired Associates Learning: Memory for Word Pairs in Cued Recall [Dataset]. http://doi.org/10.18112/openneuro.ds005059.v1.0.1
    Dataset updated
    Apr 4, 2024
    Dataset provided by
    OpenNeuro (https://openneuro.org/)
    Authors
    Haydn G. Herrema; Michael J. Kahana
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Paired Associates Learning of Word Pairs

    Description

    This dataset contains behavioral events and intracranial electrophysiological recordings from a paired associates memory task. The experiment consists of participants studying pairs of visually presented words, solving simple arithmetic problems that function as a distractor, and then completing a cued recall task. The data was collected at clinical sites across the country as part of a collaboration with the Computational Memory Lab at the University of Pennsylvania.

    Each session contains 25 lists of the structure: encoding, distractor, cued recall. During encoding, 6 pairs of words are presented one pair at a time. Each pair remains on screen for 4000 ms and is followed by a 1000 ms interstimulus interval. During the cued recall, one randomly chosen word from each pair is shown, and the participant is asked to vocally recall the other word from the pair. Participants have 5000 ms for each recall, and then the next cue (i.e., a word from another pair) is shown. All 6 pairs of words are tested on each list.

    To Note:

    • The iEEG recordings are labeled either "monopolar" or "bipolar." The monopolar recordings are referenced (typically a mastoid reference), but should always be re-referenced before analysis. The bipolar recordings are referenced according to a paired scheme indicated by the accompanying bipolar channels tables.
    • Each subject has a unique montage of electrode locations. MNI and Talairach coordinates are provided when available, along with brain region annotations.
    • Recordings were made on multiple different systems, so all voltage values have been scaled to volts (V).
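
    As a rough illustration of the paired referencing scheme mentioned in the notes above, a bipolar derivation is conventionally the sample-wise difference between two monopolar channels. The Python sketch below assumes the bipolar channels table can be read as a list of (channel, reference) name pairs; the function name `bipolar_rereference`, the channel names, and the sample values are hypothetical and are not taken from the dataset.

```python
def bipolar_rereference(monopolar, pairs):
    """Compute bipolar signals as differences of monopolar channel pairs.

    monopolar: dict mapping channel name -> list of voltage samples (V).
    pairs: list of (channel_a, channel_b) names, assumed to come from a
           bipolar channels table.
    """
    bipolar = {}
    for a, b in pairs:
        # Subtract the two recordings sample by sample.
        bipolar[f"{a}-{b}"] = [x - y for x, y in zip(monopolar[a], monopolar[b])]
    return bipolar


# Made-up example: two channels, two samples each.
signals = {"A1": [1.0, 2.0], "A2": [0.5, 0.5]}
result = bipolar_rereference(signals, [("A1", "A2")])
```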

  20. Data from: Demographic mechanisms of inbreeding adjustment through...

    • open.library.ubc.ca
    • borealisdata.ca
    Updated May 19, 2021
    Reid, Jane M.; Duthie, A. Bradley; Wolak, Matthew E.; Arcese, Peter (2021). Data from: Demographic mechanisms of inbreeding adjustment through extra-pair reproduction [Dataset]. http://doi.org/10.14288/1.0397870
    Dataset updated
    May 19, 2021
    Authors
    Reid, Jane M.; Duthie, A. Bradley; Wolak, Matthew E.; Arcese, Peter
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Time period covered
    Jun 24, 2020
    Area covered
    Canada, British Columbia
    Description

    Usage notes

    PairsData_Dryad

    This file contains data required for the basic descriptive analysis of social pairing in relation to kinship.

    NewMalesPairings_Dryad

    This file contains data required for the analysis of social pairing in relation to kinship between females and the 'new males' set of potential mates.

    AllMalesPairings_Dryad

    This file contains data required for the analysis of social pairing in relation to kinship between females and the 'all males' set of potential mates.

    PairPersistence_Dryad

    This file contains data required for the analysis of social pair persistence in relation to kinship.

    ChangeMeanK_Dryad

    This file contains data required for the analysis of changing mean kinship within the duration of females' social pairings.
