Envestnet® | Yodlee®'s Consumer Purchase Data (Aggregate/Row) Panels consist of de-identified, near-real-time (T+1) USA credit/debit/ACH transaction-level data, offering a wide view of the consumer activity ecosystem. The underlying data is sourced from end users of the aggregation portion of the Envestnet® | Yodlee® financial technology platform.
Envestnet | Yodlee Consumer Panels (Aggregate/Row) include data relating to millions of transactions, including ticket size and merchant location. The dataset includes de-identified credit/debit card and bank transactions (such as a payroll deposit, account transfer, or mortgage payment). Our coverage offers insights into areas such as consumer, TMT, energy, REITs, internet, utilities, ecommerce, MBS, CMBS, equities, credit, commodities, FX, and corporate activity. We apply rigorous data science practices to deliver key KPIs daily that are focused, relevant, and ready to put into production.
We offer free trials. Our team is available to provide support for loading, validation, sample scripts, or other services you may need to generate insights from our data.
Investors, corporate researchers, and corporates can use our data to answer key business questions such as:
- How much are consumers spending with specific merchants/brands, and how is that changing over time?
- Is the share of consumer spend at a specific merchant increasing or decreasing?
- How are consumers reacting to new products or services launched by merchants?
- For loyal customers, how is the share of spend changing over time?
- What is the company’s market share in a region for similar customers?
- Is the company’s loyal user base increasing or decreasing?
- Is the lifetime customer value increasing or decreasing?
Additional Use Cases:
- Use spending data to analyze sales/revenue broadly (sector-wide) or granularly (company-specific). Historically, our tracked consumer spend has correlated above 85% with company-reported data from thousands of firms. Users can sort and filter by many metrics and KPIs, such as sales and transaction growth rates and online or offline transactions, as well as view customer behavior within a geographic market at a state or city level.
- Reveal cohort consumer behavior to decipher long-term behavioral consumer spending shifts. Measure market share, wallet share, loyalty, consumer lifetime value, retention, demographics, and more.
- Study the effects of inflation via such metrics as increased total spend, ticket size, and number of transactions.
- Seek out alpha-generating signals or manage your business strategically with essential, aggregated transaction and spending data analytics.
Use Case Categories (our data supports innumerable use cases, and we look forward to working with new ones):
1. Market Research: Company Analysis, Company Valuation, Competitive Intelligence, Competitor Analysis, Competitor Analytics, Competitor Insights, Customer Data Enrichment, Customer Data Insights, Customer Data Intelligence, Demand Forecasting, Ecommerce Intelligence, Employee Pay Strategy, Employment Analytics, Job Income Analysis, Job Market Pricing, Marketing, Marketing Data Enrichment, Marketing Intelligence, Marketing Strategy, Payment History Analytics, Price Analysis, Pricing Analytics, Retail, Retail Analytics, Retail Intelligence, Retail POS Data Analysis, and Salary Benchmarking
2. Investment Research: Financial Services, Hedge Funds, Investing, Mergers & Acquisitions (M&A), Stock Picking, Venture Capital (VC)
3. Consumer Analysis: Consumer Data Enrichment, Consumer Intelligence
4. Market Data: Analytics, B2C Data Enrichment, Bank Data Enrichment, Behavioral Analytics, Benchmarking, Customer Insights, Customer Intelligence, Data Enhancement, Data Enrichment, Data Intelligence, Data Modeling, Ecommerce Analysis, Ecommerce Data Enrichment, Economic Analysis, Financial Data Enrichment, Financial Intelligence, Local Economic Forecasting, Location-based Analytics, Market Analysis, Market Analytics, Market Intelligence, Market Potential Analysis, Market Research, Market Share Analysis, Sales, Sales Data Enrichment, Sales Enablement, Sales Insights, Sales Intelligence, Spending Analytics, Stock Market Predictions, and Trend Analysis
This file provides summary or aggregated measures for the 82 societies participating in the first four waves of the World Values Survey. Thus, the society, rather than the individuals surveyed, is the unit of analysis.
"The World Values Survey is a worldwide investigation of sociocultural and political change. It is conducted by a network of social scientists at leading universities all around world.
Interviews have been carried out with nationally representative samples of the publics of more than 80 societies on all six inhabited continents. A total of four waves have been carried out since 1981, making it possible to carry out reliable global cross-cultural analyses and analyses of change over time. The World Values Survey has produced evidence of gradual but pervasive changes in what people want out of life. Moreover, the survey shows that the basic direction of these changes is, to some extent, predictable.
This project is being carried out by an international network of social scientists, with local funding for each survey (though in some cases, it has been possible to raise supplementary funds from outside sources). In exchange for providing the data from interviews with a representative national sample of at least 1,000 people in their own society, each participating group gets immediate access to the data from all of the other participating societies. Thus, they are able to compare the basic values and beliefs of the people of their own society with those of more than 60 other societies. In addition, they are invited to international meetings at which they can compare findings and interpretations with other members of the WVS network."
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The record is aimed at helping the reporting countries submit the 2024 sample-based level data to the EFSA Data Collection Framework. We include here two Excel files, one XML file, and one Word document, and we provide below specific information on their use.
The two Excel documents help in mapping terms from the matrix catalogue ZOO_CAT_MATRIX used in the aggregated prevalence data model to FoodEx2 codes, and offer examples of how prevalence data can be reported using SSD2 and how the data are aggregated afterwards. The XML file contains the same example as the similarly titled Excel file, in the XML format that allows it to be uploaded to the Data Collection Framework.
The Word document explains the examples provided in the Excel and XML files and how the aggregation of data reported at sample-based level is performed.
At Echo, our dedication to data curation is unmatched; we focus on providing our clients with an in-depth picture of a physical location based on activity in and around a point of interest over time. Our dataset empowers you to explore the “what” by allowing you to dig deeper into customer movement behaviors, eliminate gaps in your trade area and discover untapped potential. Leverage Echo's Activity datasets to identify new growth opportunities and gain a competitive advantage.
This sample of our Area Activity data provides you insights into the estimated total unique visitors and visits in an area. This helps you understand frequentation dynamics over time, identify emerging trends in people movements and measure the impact of external factors on how people move across a city.
Additional Information:
- Understand the actual movement patterns of consumers without using PII data, gaining a 360-degree consumer view. Complement your online behavior knowledge with actual offline actions, and better attribute intent based on real-world behaviors.
- Echo collects, cleans and updates its footfall data on a daily basis. Normalization of the data occurs on a monthly basis.
- We provide data aggregation on a weekly, monthly and quarterly basis.
- Information about our country offering and data schema can be found here:
1) Data Schema: https://docs.echo-analytics.com/activity/data-schema
2) Country Availability: https://docs.echo-analytics.com/activity/country-coverage
3) Methodology: https://docs.echo-analytics.com/activity/methodology
Echo's commitment to customer service is evident in our exceptional data quality and dedicated team, providing 360° support throughout your location intelligence journey. We handle the complex tasks to deliver analysis-ready datasets to you.
Business Needs:
1. Site Selection: Leverage footfall data to identify the best location to open a new store. By analyzing areas with high footfall, you can select sites that are likely to attract more customers.
2. Urban Planning Development: City planners can use footfall data to optimize the layout and infrastructure of urban areas, guide the development of commercial areas by indicating where pedestrian traffic is heaviest, and aid in traffic management and safety measures.
3. Real Estate Investment: Leverage footfall data to identify lucrative investment opportunities and optimize property management by analyzing pedestrian traffic patterns.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The file set is a freely downloadable aggregation of information about Australian schools. The individual files represent a series of tables which, when considered together, form a relational database. The records cover the years 2008-2014 and include information on approximately 9500 primary and secondary school main campuses and around 500 sub-campuses. The records all relate to school-level data; no data about individuals is included. All the information has previously been published and is publicly available, but it has not previously been released as a documented, useful aggregation. The information includes:
(a) the names of schools
(b) staffing levels, including full-time and part-time teaching and non-teaching staff
(c) student enrolments, including the number of boys and girls
(d) school financial information, including Commonwealth government, state government, and private funding
(e) test data, potentially for school years 3, 5, 7 and 9, relating to an Australian national testing programme known by the trademark 'NAPLAN'
Documentation of this Edition 2016.1 is incomplete but the organization of the data should be readily understandable to most people. If you are a researcher, the simplest way to study the data is to make use of the SQLite3 database called 'school-data-2016-1.db'. If you are unsure how to use an SQLite database, ask a guru.
The database was constructed directly from the other included files by running the following command at a command-line prompt:

sqlite3 school-data-2016-1.db < school-data-2016-1.sql

Note that a few non-consequential errors will be reported if you run this command yourself. The reason for the errors is that the SQLite database is created by importing a series of '.csv' files. Each of the .csv files contains a header line with the names of the variables for each column. That information is useful for many statistical packages, but it is not what SQLite expects, so it complains about the header. Despite the complaint, the database will be created correctly.
Briefly, the data are organized as follows.
(1) The .csv files ('comma separated values') do not actually use a comma as the field delimiter. Instead, the vertical bar character '|' (ASCII octal 174, decimal 124, hex 7C) is used. If you read the .csv files using Microsoft Excel, OpenOffice, or LibreOffice, you will need to set the field separator to '|'. Check your software documentation to understand how to do this.
(2) Each school-related record is indexed by an identifier called 'ageid'. The ageid uniquely identifies each school and consequently serves as the appropriate variable for JOINing records in different data files. For example, the first school-related record after the header line in file 'students-headed-bar.csv' shows the ageid of the school as 40000. The relevant school name can be found by looking in the file 'ageidtoname-headed-bar.csv' to discover that the ageid 40000 corresponds to a school called 'Corpus Christi Catholic School'.
(3) In addition to the variable 'ageid', each record is also identified by one or two 'year' variables. The most important purpose of a year identifier is to indicate the year to which the record is relevant. For example, turning again to file 'students-headed-bar.csv', one sees that the first seven school-related records after the header line all relate to the school Corpus Christi Catholic School with ageid 40000. The variable that identifies the important differences between these seven records is 'studentyear', which shows the year to which the student data refer. One can see, for example, that in 2008 there were a total of 410 students enrolled, of whom 185 were girls and 225 were boys (look at the variable names in the header line).
(4) The variables relating to years are given different names in each of the different files ('studentyear' in the file 'students-headed-bar.csv', 'financesummaryyear' in the file 'financesummary-headed-bar.csv'). Despite the different names, the year variables provide the second-level means for joining information across files. For example, to relate the enrolments at a school in each year to its financial state, you might JOIN records using 'ageid' in the two files and, secondarily, match 'studentyear' with 'financesummaryyear'.
(5) The manipulation of the data is most readily done using the SQL language with the SQLite database, but it can also be done in a variety of statistical packages.
(6) It is our intention for Edition 2016-2 to create large 'flat' files suitable for use by non-researchers who want to view the data with spreadsheet software. The disadvantage of such 'flat' files is that they contain vast amounts of redundant information and might not display the data in the form that the user most wants.
(7) Geocoding of the schools is not available in this edition.
(8) Some files, such as 'sector-headed-bar.csv', are not used in the creation of the database but are provided as a convenience for researchers who might wish to recode some of the data to remove redundancy.
(9) A detailed example of a suitable SQLite query can be found in the file 'school-data-sqlite-example.sql'. The same query, used in the context of analyses done with the excellent, freely available R statistical package (http://www.r-project.org), can be seen in the file 'school-data-with-sqlite.R'.
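To illustrate the two-level JOIN described in points (2) and (4), here is a minimal Python sketch querying the database. The table names ('students', 'financesummary', 'ageidtoname') and the columns 'name', 'total', and 'totalincome' are assumptions inferred from the file descriptions above, not confirmed schema; check the actual schema with .tables and .schema in the sqlite3 shell first.

```python
import sqlite3

# Join enrolment and finance records for one school (ageid 40000, the
# 'Corpus Christi Catholic School' example used in the text above).
# Table and column names are assumed from the file descriptions.
conn = sqlite3.connect("school-data-2016-1.db")
query = """
SELECT n.name, s.studentyear, s.total, f.totalincome
FROM students AS s
JOIN ageidtoname AS n ON n.ageid = s.ageid
JOIN financesummary AS f
  ON f.ageid = s.ageid                     -- first-level join on ageid
 AND f.financesummaryyear = s.studentyear  -- second-level join on year
WHERE s.ageid = 40000
ORDER BY s.studentyear;
"""
for row in conn.execute(query):
    print(row)
conn.close()
```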
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This collection contains aggregated metadata on environmental monitoring and observing activities from three Australian national research infrastructures (NRIs): biodiversity survey events from the Atlas of Living Australia (ALA), marine observations collected by the Integrated Marine Observing System (IMOS), and site-based monitoring and survey efforts by the Terrestrial Ecosystem Research Network (TERN). This dataset provides a summary breakdown of these efforts by survey topic, region, and time period from 2010 to the present.
Survey topics are mapped to an EcoAssets Earth Science Features vocabulary based on the Earth Science keywords from the Global Change Master Directory (GCMD) vocabulary, modified to use taxonomic concept URIs from the Australian National Species List (ANSL) in place of the GCMD Earth Science > Biological Classification vocabulary. ANSL categories map more readily to biodiversity survey categories, since GCMD depends on a top-level division between vertebrates and invertebrates rather than offering an animal category. The EcoAssets Earth Science Features vocabulary, including alternative keywords used in ALA, IMOS, or TERN datasets, is included in this collection.
The primary asset is aggregated_env_monitoring.csv. This contains all faceted data records for the period and supported facets related to time, space, and features observed.
Two derived assets (summary_monitoring_effort_terrestrial.csv, summary_monitoring_effort_marine.csv) further summarise the faceted data. Each is a pivot of the aggregated dataset.
vocabulary_earth_science_features.csv contains the hierarchical terms used within this asset to categorise earth science features. treeview_earth_science_features.txt provides a simpler, more readable view. keyword_mapping.csv shows the mappings between these terms and the keywords used in source datasets. The data_sources_env_monitoring.csv file includes information on the source datasets within the Atlas of Living Australia that contributed to this asset. Lineage: This dataset was created by the following pipeline:
Metadata records were collected from the TERN linked data portal (https://linkeddata.tern.org.au/) for all TERN monitoring sites and survey activities. Feature terms follow the TERN Feature Type vocabulary, mapped to the EcoAssets Earth Science Features vocabulary. For features that have been measured continuously at a site, metadata records were created for each relevant year since the commissioning of the site. For other sites and features, metadata records were generated only for years in which the site was visited. TERN metadata records are associated with site coordinates.
Metadata records were harvested for datasets in the Australian Ocean Data Network (AODN, https://portal.aodn.org.au/) portal maintained by IMOS (iso19115-3.2018 format over OAI-PMH). Feature terms follow the GCMD keywords used in these metadata records. Metadata records were created for each year overlapping the data collection period for each dataset. Where the datasets were associated with a bounding box, records were created for each IMCRA region intersecting the bounding box.
Metadata records were created for each biodiversity sample event published to the ALA and associated with a Darwin Core event ID and a named sampling protocol (see https://dwc.tdwg.org/terms/#event). Events were excluded if the set of sampled taxa included multiple kingdoms OR the sampling protocol was associated with <50 samples OR no sample included >1 species. The remaining samples were mapped to feature terms based on the taxonomic scope of all species recorded for the associated protocol. Year and coordinates were taken from the associated sample event.
Metadata records from all sources were combined and include the following values. The feature facet values are offered as a convenience for grouping records without using the hierarchical structure of the EcoAssets Earth Science Features vocabulary:
• Source National Research Infrastructure (NRI – one of ALA, IMOS, TERN)
• Dataset name
• Dataset URI
• Original keyword from NRI (TERN feature type, IMOS GCMD keyword, ALA taxon)
• Decimal latitude (where appropriate)
• Decimal longitude (where appropriate)
• Year
• State or Territory
• IBRA7 terrestrial region
• IMCRA 4.0 mesoscale marine bioregion
• Feature ID from EcoAssets Earth Science Features vocabulary
• Feature name associated with feature ID
• Feature facet 1 – high-level facet based on feature ID – a top-level GCMD Earth Science category (6 terms)
• Feature facet 2 – intermediate-level facet based on feature ID – second-level GCMD/ANSL category (29 terms)
• Feature facet 3 – lower-level facet with more fine-grained taxonomic structure based on feature ID – typically a third-level GCMD/ANSL category (36 terms)
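As a usage illustration, grouping the faceted records by a few of the fields listed above yields the kind of effort breakdown the derived summary files contain. The sketch below is a minimal example; the snake_case column names ('nri', 'year', 'feature_facet_1') are assumptions based on the field list, and the actual headers in aggregated_env_monitoring.csv may differ.

```python
import pandas as pd

# Load the primary asset (column names assumed from the field list above).
df = pd.read_csv("aggregated_env_monitoring.csv")

# Count metadata records per NRI, year, and top-level feature facet,
# approximating the monitoring-effort summaries described in the text.
effort = (
    df.groupby(["nri", "year", "feature_facet_1"])
      .size()
      .reset_index(name="record_count")
)
print(effort.head())
```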
https://pamepi.rondonia.fiocruz.br/en/covid_en.html
The current file contains community-level aggregate information extracted from health, human mobility, population inequality, and non-pharmaceutical intervention data. Integrating variables from different sources facilitates data analysis and epidemiological studies, since the data set is aligned and represents a single entry for each city and day since the beginning of the pandemic in Brazil.
The data includes, for example, the daily time series of mild to moderate cases resulting from the Flu Syndrome database, hospital occupancy and deaths from the Severe Acute Respiratory Syndrome database, vaccine doses administered daily, etc.
To familiarize yourself with the data, a data explorer and dictionary are also available at https://pamepi.rondonia.fiocruz.br/en/aggregated_en.html, and the code used to create the data set can be found in our GitHub repository: https://github.com/PAMepi/PAMepi_scripts_datalake.git.
This work can be cited as: Platform for Analytical Models in Epidemiology. (2022). PAMepi/PAMepi_scripts_datalake: v1.0.0 (v1.0.0). Zenodo. https://doi.org/10.5281/zenodo.6384641. GitHub repository: https://github.com/PAMepi/PAMepi_scripts_datalake.git
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This article discusses the use of Bayesian methods for estimating logit demand models using aggregate data. We analyze two different demand systems: independent samples and consumer panel. Under the first system, there is a different and independent random sample of N consumers in each period and each consumer makes only a single purchase decision. Under the second system, the same N consumers make a purchase decision in each of T periods. Interestingly, there exists an asymptotic link between these two systems, which has important implications for the estimation of these demand models. The proposed methods are illustrated using simulated and real data.
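The distinction between the two demand systems can be made concrete with a small simulation. The sketch below is illustrative only (the utility specification, parameter values, and binary-choice setup are assumptions, not the article's model): it generates aggregate purchase shares under both sampling schemes for a logit with heterogeneous price sensitivity.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 1000, 20                      # consumers per period, number of periods
price = rng.uniform(1.0, 2.0, T)     # hypothetical price covariate
beta_mean, beta_sd = -2.0, 0.5       # assumed heterogeneous price sensitivity

def buy_prob(beta, p):
    """Binary logit purchase probability with utility u = 1 + beta * price."""
    return 1.0 / (1.0 + np.exp(-(1.0 + beta * p)))

# System 1: independent samples -- a fresh random sample of N consumers
# in each period, each making a single purchase decision.
shares_indep = np.array([
    rng.binomial(1, buy_prob(rng.normal(beta_mean, beta_sd, N), p)).mean()
    for p in price
])

# System 2: consumer panel -- the same N consumers decide in all T periods,
# so their individual taste draws persist across periods.
beta_panel = rng.normal(beta_mean, beta_sd, N)
shares_panel = np.array([
    rng.binomial(1, buy_prob(beta_panel, p)).mean() for p in price
])

print(shares_indep.round(3))
print(shares_panel.round(3))
```

Under the independent-samples scheme, taste heterogeneity is redrawn every period; under the panel scheme it is fixed, which is the source of the asymptotic link between the two systems that the article exploits.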
The data underlying this published work have been made publicly available in this repository as part of the IMASC Data Management Plan. This work was supported as part of the Integrated Mesoscale Architectures for Sustainable Catalysis (IMASC), an Energy Frontier Research Center funded by the U.S. Department of Energy, Office of Science, Basic Energy Sciences under Award # DE-SC0012573.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Modality-agnostic files were copied over and the CHANGES file was updated. Data was aggregated using:
python phenotype.py aggregate subject -i segregated_subject -o aggregated_subject
phenotype.py came from the GitHub repository: https://github.com/ericearl/bids-phenotype
A comprehensive clinical, MRI, and MEG collection characterizing healthy research volunteers collected at the National Institute of Mental Health (NIMH) Intramural Research Program (IRP) in Bethesda, Maryland using medical and mental health assessments, diagnostic and dimensional measures of mental health, cognitive and neuropsychological functioning, structural and functional magnetic resonance imaging (MRI), along with diffusion tensor imaging (DTI), and a comprehensive magnetoencephalography battery (MEG).
In addition, blood samples of healthy volunteers are currently banked for future genetic analysis. All data collected in this protocol are broadly shared in the OpenNeuro repository, in the Brain Imaging Data Structure (BIDS) format, and task paradigms and basic pre-processing scripts are shared on GitHub. This dataset is unique in its depth of characterization of a healthy population in terms of brain health and will contribute to a wide array of secondary investigations of non-clinical and clinical research questions.
This dataset is licensed under the Creative Commons Zero (CC0) v1.0 License.
Inclusion criteria for the study require that participants are adults at or over 18 years of age in good health with the ability to read, speak, understand, and provide consent in English. All participants provided electronic informed consent for online screening and written informed consent for all other procedures. Exclusion criteria include:
Study participants are recruited through direct mailings, bulletin boards and listservs, outreach exhibits, print advertisements, and electronic media.
All potential volunteers first visit the study website (https://nimhresearchvolunteer.ctss.nih.gov), check a box indicating consent, and complete preliminary self-report screening questionnaires. The study website is HIPAA compliant and therefore does not collect PII; instead, participants are instructed to contact the study team to provide their identity and contact information. The questionnaires include demographics, clinical history including medications, disability status (WHODAS 2.0), mental health symptoms (modified DSM-5 Self-Rated Level 1 Cross-Cutting Symptom Measure), substance use survey (DSM-5 Level 2), alcohol use (AUDIT), handedness (Edinburgh Handedness Inventory), and perceived health ratings. At the conclusion of the questionnaires, participants are again prompted to send an email to the study team. Survey results, supplemented by NIH medical records review (if present), are reviewed by the study team, who determine if the participant is likely eligible for the protocol. These participants are then scheduled for an in-person assessment. Follow-up phone screenings were also used to determine if participants were eligible for in-person screening.
At this visit, participants undergo a comprehensive clinical evaluation to determine final eligibility for inclusion as a healthy research volunteer. The mental health evaluation consists of a psychiatric diagnostic interview (Structured Clinical Interview for DSM-5 Disorders, SCID-5), along with self-report surveys of mood (Beck Depression Inventory-II, BDI-II) and anxiety (Beck Anxiety Inventory, BAI) symptoms. An intelligence quotient (IQ) estimate is obtained with the Kaufman Brief Intelligence Test, Second Edition (KBIT-2). The KBIT-2 is a brief (20-30 minute) assessment of intellectual functioning administered by a trained examiner. There are three subtests, including verbal knowledge, riddles, and matrices.
Medical evaluation includes medical history elicitation and systematic review of systems. Biological and physiological measures include vital signs (blood pressure, pulse), as well as weight, height, and BMI. Blood and urine samples are taken and a complete blood count, acute care panel, hepatic panel, thyroid stimulating hormone, viral markers (HCV, HBV, HIV), C-reactive protein, creatine kinase, urine drug screen and urine pregnancy tests are performed. In addition, blood samples that can be used for future genomic analysis, development of lymphoblastic cell lines or other biomarker measures are collected and banked with the NIMH Repository and Genomics Resource (Infinity BiologiX). The Family Interview for Genetic Studies (FIGS) was later added to the assessment in order to provide better pedigree information; the Adverse Childhood Events (ACEs) survey was also added to better characterize potential risk factors for psychopathology. The entirety of the in-person assessment not only collects information relevant for eligibility determination, but it also provides a comprehensive set of standardized clinical measures of volunteer health that can be used for secondary research.
Participants are given the option to consent for a magnetic resonance imaging (MRI) scan, which can serve as a baseline clinical scan to determine normative brain structure, and also as a research scan with the addition of functional sequences (resting state and diffusion tensor imaging). The MR protocol used was initially based on the ADNI-3 basic protocol, but was later modified to include portions of the ABCD protocol in the following manner:
At the time of the MRI scan, volunteers are administered a subset of tasks from the NIH Toolbox Cognition Battery. The four tasks include:
An optional MEG study was added to the protocol approximately one year after the study was initiated, thus there are relatively fewer MEG recordings in comparison to the MRI dataset. MEG studies are performed on a 275-channel CTF MEG system (CTF MEG, Coquitlam, BC, Canada). The position of the head was localized at the beginning and end of each recording using three fiducial coils. These coils were placed 1.5 cm above the nasion, and at each ear, 1.5 cm from the tragus on a line between the tragus and the outer canthus of the eye. For 48 participants (as of 2/1/2022), photographs were taken of the three coils and used to mark the points on the T1-weighted structural MRI scan for co-registration. For the remainder of the participants (n=16 as of 2/1/2022), a Brainsight neuronavigation system (Rogue Research, Montréal, Québec, Canada) was used to coregister the MRI and fiducial localizer coils in real time prior to MEG data acquisition.
Online and in-person behavioral and clinical measures, along with the corresponding phenotype file name, sorted first by measurement location and then by file name.

Location | Measure | File Name
---|---|---
Online | Alcohol Use Disorders Identification Test (AUDIT) | audit
Online | Demographics | demographics
Online | DSM-5 Level 2 Substance Use - Adult | drug_use
Online | Edinburgh Handedness Inventory (EHI) | ehi
Online | Health History Form | health_history_questions
Online | Perceived Health Rating - self | health_rating
Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
**STAR traffic data per line per day**

This dataset is replaced by the dataset https://data.explore.star.fr/explore/dataset/tco-billettique-star-frequentation-agregee-td/information/ and will be permanently removed at the end of May 2023.

This dataset provides STAR traffic data per line per month. The data can be downloaded through the URLs transmitted in this dataset. At the beginning of each month, the attendance data for month N-2 are made available. For example, if we are in May, March data is available; the May data will therefore be available in early July. This dataset offers attendance data over a rolling year (no history prior to one year).

The structure of the attendance file is as follows:
- Column 1: Date of the day of operation
- Column 2: Line ID (= lineo)
- Column 3: Short name of the line in the commercial sense
- Column 4: Attendance

The information available in this file can be cross-referenced with the information available on STAR open data. Be careful, however, of the temporal desynchronisation of the data: open data offers, for example, the current network topology and timetable data, while the attendance data are those recorded two months earlier (current month minus two months). You therefore need to keep open-data snapshots, manage your own versioning, and/or use GTFS data to cross-reference the information.

Particularities:
- Line ID = 9000: traffic data for park-and-ride facilities.
- Stop point identifier = 65501 or 9999: off-network attendance data. For technical reasons, this attendance data, recorded via validations, cannot be located (quantified but not usable): the attendance may be attributable at the stop-point level but not the line level, or at neither the stop point nor the line.
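Given the four-column layout described above, a minimal loading-and-aggregation sketch might look as follows. The file name, delimiter, presence of a header row, and column names are all assumptions (French open-data portals commonly use ';'); check a downloaded file before relying on them.

```python
import pandas as pd

# Assumed four-column layout: date of operation, line ID (lineo),
# commercial short name, attendance. Delimiter and header are guesses.
cols = ["date", "lineo", "line_short_name", "attendance"]
df = pd.read_csv("star_attendance.csv", sep=";", names=cols, header=0,
                 parse_dates=["date"])

# Drop the special park-and-ride rows flagged above (lineo = 9000)
# before line-level analysis.
df = df[df["lineo"] != 9000]

# Total attendance per line per month.
monthly = (df.groupby(["lineo", "line_short_name",
                       df["date"].dt.to_period("M")])["attendance"]
             .sum())
print(monthly.head())
```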
This session will focus on the baseline of skills that Data Liberation Initiative (DLI) Contacts should have and the corresponding training to achieve these skills. Introducing newcomers to the language of statistics and data is one of the important tasks of the orientation. Acquiring a technical language often poses a barrier to newcomers. To overcome this hurdle, newcomers must grasp both the meaning of new concepts and the abbreviated language of acronyms. Should we expect the orientation to offer all of the baseline skills or is other instruction needed? Do different local environments result in varying uses of DLI resources? Are the same skills needed among differing environments? How much attention should be paid during the orientation to different models of data service? For example, should the implications of buying services from elsewhere (e.g., Sherlock, IDLS, CHASS, Queen’s, etc.) be covered? What kind of distinctions need to be made for the levels of support for instructional and research uses of data? What about the reference uses of data, that is, using data to answer reference questions? Are there additional skills required of those supporting DLI data for research and reference uses? If there are, what are they and how should they be introduced?
Monthly report including total dispatched trips, total dispatched shared trips, and unique dispatched vehicles aggregated by FHV (For-Hire Vehicle) base. These have been tabulated from raw trip record submissions made by bases to the NYC Taxi and Limousine Commission (TLC). This dataset is typically updated monthly on a two-month lag, as bases have until the conclusion of the following month to submit a month of trip records to the TLC. For example, a base has until Feb 28 to submit complete trip records for January; therefore, the January base aggregates will appear in March at the earliest. The TLC may elect to defer updates to the FHV Base Aggregate Report if a large number of bases have failed to submit trip records by the due date. Note: The TLC publishes base trip record data as submitted by the bases, and we cannot guarantee or confirm their accuracy or completeness. Therefore, this may not represent the total amount of trips dispatched by all TLC-licensed bases. The TLC performs routine reviews of the records and takes enforcement actions when necessary to ensure, to the extent possible, complete and accurate information.
Abstract copyright UK Data Service and data collection copyright owner.
The UK censuses took place on 21st April 1991. They were run by the Census Office for Northern Ireland, the General Register Office for Scotland, and the Office of Population Censuses and Surveys for England and Wales. The UK comprises the countries of England, Wales, Scotland and Northern Ireland.
Statistics from the UK censuses help paint a picture of the nation and how we live. They provide a detailed snapshot of the population and its characteristics, and underpin funding allocation to provide public services.
Population bases
Age and marital status
Communal establishments
Medical and care establishments
Hotels and other establishments
Ethnic group
Country of birth
Economic position
Economic position and ethnic group
Term-time address
Persons present
Long-term illness in households
Long-term illness in communal establishments
Long-term illness and economic position
Migrants
Wholly moving households
Ethnic group of migrants
Imputed residents
Imputed households
Tenure and amenities
Car availability
Rooms and household size
Persons per room
Residents 18 and over
Visitor households
Students in households
Households: 1971/'81/'91 bases
Dependants in households
Dependants and long-term illness
Carers
Dependent children in households
Households with children aged 0 - 15
Women in couples: economic position
Economic position of household residents
Age & marital status of household residents
Earners and dependent children
Young adults
Single years of age
Headship
Lone 'parents'
Shared accommodation
Household composition and housing
Household composition and ethnic group
Household composition and long-term illness
Migrant household heads
Households with dependent children; housing
Households with pensioners; housing
Households with dependants; housing
Ethnic group; housing
Country of birth; household heads and residents
Country of birth and ethnic group
Language indicators
Lifestages
Occupancy (Occupied; vacant; other accommodation)
Household spaces and...
This dataset consists of Particle Size Distribution (PSD) measurements made on 419 archived topsoil samples and derived aggregate stability metrics from arable and grassland habitats across Great Britain in 2007. Laser granulometry was used to measure the PSD of 1–2 mm aggregates before and after sonication, and the difference in their Mean Weight Diameter (MWD) was used to indicate aggregate stability. The samples were collected as part of the Countryside Survey monitoring programme, a unique study or ‘audit’ of the natural resources of the UK’s countryside. The analyses were conducted as part of a study aiming to quantify how soil quality indicators change across a gradient of agricultural land management and to identify conditions that determine the ability of different soils to resist and recover from perturbations.
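The MWD-based stability metric described above can be expressed in a few lines. The sketch below uses the standard definition of MWD (the mass-fraction-weighted mean of size-class midpoints); the size classes and fractions are hypothetical illustration values, not data from this dataset.

```python
import numpy as np

def mean_weight_diameter(class_mid_diameters_um, mass_fractions):
    """Mean Weight Diameter: fraction-weighted mean of size-class midpoints."""
    f = np.asarray(mass_fractions, dtype=float)
    f = f / f.sum()  # normalise to proportions
    return float(np.sum(np.asarray(class_mid_diameters_um) * f))

# Hypothetical PSDs of 1-2 mm aggregates before and after sonication
# (size-class midpoints in micrometres; fractions are made up).
mids = [10, 50, 150, 400, 900]
mwd_before = mean_weight_diameter(mids, [0.05, 0.10, 0.20, 0.35, 0.30])
mwd_after = mean_weight_diameter(mids, [0.20, 0.25, 0.25, 0.20, 0.10])

# A larger drop in MWD after sonication indicates greater disaggregation,
# i.e. less stable aggregates, per the description above.
delta_mwd = mwd_before - mwd_after
print(f"MWD before={mwd_before:.0f} um, after={mwd_after:.0f} um, "
      f"delta={delta_mwd:.0f} um")
```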
The Measurable AI FoodPanda Food & Grocery Transaction dataset is a leading source of email receipts and transaction data, offering data collected directly from users via Proprietary Consumer Apps, with millions of opt-in users.
We source our email receipt consumer data panel via two consumer apps which garner the express consent of our end-users (GDPR compliant). We then aggregate and anonymize all the transactional data to produce raw and aggregate datasets for our clients.
Use Cases
Our clients leverage our datasets to produce actionable consumer insights such as:
- Market share analysis
- User behavioral traits (e.g. retention rates)
- Average order values
- Promotional strategies used by the key players
Several of our clients also use our datasets for forecasting and understanding industry trends better.
Coverage - Asia (Hong Kong, Taiwan, Singapore, Thailand, Malaysia, Philippines, Pakistan)
Granular Data
Itemized, high-definition data per transaction level with metrics such as:
- Order value
- Items ordered
- No. of orders per user
- Delivery fee
- Service fee
- Promotions used
- Geolocation data, and more
Aggregate Data
- Weekly/monthly order volume
- Revenue delivered in aggregate form, with historical data dating back to 2018
All the transactional e-receipts are sent from the FoodPanda food delivery app to users’ registered accounts.
Most of our clients are fast-growing Tech Companies, Financial Institutions, Buyside Firms, Market Research Agencies, Consultancies and Academia.
Our dataset is GDPR compliant, contains no PII, and is aggregated and anonymized with user consent. Contact business@measurable.ai for a data dictionary and to find out our volume in each country.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Aggregated Australian species occurrence data from 1900 to the present using a suite of facets of most importance for environmental assessments. Occurrence records were aggregated and organised by the Atlas of Living Australia (ALA, https://ala.org.au/) and include survey and monitoring data collected and managed by the Integrated Marine Observing System (IMOS, https://imos.org.au/) and the Terrestrial Ecosystem Research Network (TERN, https://tern.org.au/).
Data from these infrastructures and other sources have been organised here as a national public-access dataset.
This dataset serves as a standardised snapshot of Australian biodiversity occurrence data from which many indicator datasets can more readily be derived (see Has Derivation entries below).
The primary asset is AggregatedData_AustralianSpeciesOccurrences_1.1.2023-06-13.csv. This contains all faceted data records for the period and supported facets related to time, space, taxonomy and conservation significance.
Six derived assets (SummaryData-ProtectionStatusAustralianMarineSpeciesOccurrences-1.1.2023-06-13.csv, SummaryData-ProtectionStatusAustralianTerrestrialSpeciesOccurrences-1.1.2023-06-13.csv, SummaryData-IntroducedSpeciesOccurrencesByMarineEcoregion-1.1.2023-06-13.csv, SummaryData-IntroducedSpeciesOccurrencesByTerrestrialEcoregion-1.1.2023-06-13.csv, SummaryData-ThreatenedSpeciesOccurrencesByMarineEcoregion-1.1.2023-06-13.csv, SummaryData-ThreatenedSpeciesOccurrencesByTerrestrialEcoregion-1.1.2023-06-13.csv) demonstrate uses supported by the faceted data. Each is a pivot of the aggregated dataset.
The data-sources.csv file includes information on the source datasets within the Atlas of Living Australia that contributed to this asset. README.txt documents the columns in each data file.
Grouping records from this dataset supports comparisons between the number of occurrence records for different regions and/or time periods and/or categories of species and occurrence data. Grouped counts of this kind may serve as useful indications of variation and change across the dimensions compared. Note however that such counts may not accurately reflect real differences in biodiversity. It is important to consider confounding factors (particularly variations in recording effort over time). Grouping all records by a single facet (e.g. IBRA region) may help to expose such factors.
These data are versioned at 12-month intervals. Previous versions will be linked below under Previous Version. The latest version can always be accessed at https://ecoassets.org.au/data/aggregated-data-australian-species-occurrences/.
Notes
GRIIS 1.6 includes a number of vertebrate species listed because some individuals have been translocated or (re-)introduced beyond their remaining ranges for conservation purposes. It is unhelpful for the current analysis to treat these as introduced species. These species were removed from the version of the GRIIS list used in this analysis. In future versions of GRIIS, these species will be documented as native species that have been translocated/reintroduced. Lineage: All species occurrence data aggregated by the ALA as of 2022-12-31 were filtered to include only:
Filtered data were processed to include the following elements:
Processed occurrence data were grouped to count records detected for each distinct combination of eleven primary facets. The resulting dataset is published as follows:
This dataset includes the following elements:
Six derived summary datasets are also included. Each of this is a pivot of data in the main dataset and demonstrates a use case for the information:
These two datasets include the following columns:
These two datasets include the following columns:
These two datasets include the following columns:
This dataset consists of the 1km raster, dominant aggregate class version of the Land Cover Map 2015 (LCM2015) for Great Britain. The 1km dominant coverage product is based on the 1km percentage product and reports the aggregated habitat class with the highest percentage cover for each 1km pixel. The 10 aggregate classes are groupings of 21 target classes, which are based on the Joint Nature Conservation Committee (JNCC) Broad Habitats, which encompass the entire range of UK habitats. The aggregate classes group some of the more specialised classes into more general categories. For example, the five coastal classes in the target classes are grouped into a single aggregate coastal class. This dataset is derived from the vector version of the Land Cover Map, which contains individual parcels of land cover and is the highest available spatial resolution. LCM2015 was produced at the Centre for Ecology & Hydrology by classifying satellite images from 2014 and 2015 into 21 Broad Habitat-based classes. It is one of a series of land cover maps produced by UKCEH since 1990, with versions in 1990, 2000, 2007, 2015, 2017, 2018 and 2019. LCM2015 consists of a range of raster and vector products, and users should familiarise themselves with the full range (see related records, the CEH web site and the LCM2015 Dataset documentation) to select the product most suited to their needs. Full details about this dataset can be found at https://doi.org/10.5285/711c8dc1-0f4e-42ad-a703-8b5d19c92247
This dataset contains aggregate data on violent index victimizations at the quarter level of each year (i.e., January – March, April – June, July – September, October – December), from 2001 to the present (1991 to present for homicides), with a focus on those related to gun violence. Index crimes are 10 crime types selected by the FBI (codes 1-4) for special focus due to their seriousness and frequency. This dataset includes only those index crimes that involve bodily harm or the threat of bodily harm and are reported to the Chicago Police Department (CPD). Each row is aggregated up to victimization type, age group, sex, race, and whether the victimization was domestic-related. Aggregating at the quarter level provides large enough blocks of incidents to protect anonymity while allowing the end user to observe inter-year and intra-year variation. Any row where there were fewer than three incidents during a given quarter has been deleted to help prevent re-identification of victims. For example, if there were three domestic criminal sexual assaults during January to March 2020, all victims associated with those incidents have been removed from this dataset. Human trafficking victimizations have been aggregated separately due to the extremely small number of victimizations.

This dataset includes a "GUNSHOT_INJURY_I" column to indicate whether the victimization involved a shooting, showing either Yes ("Y"), No ("N"), or Unknown ("UNKNOWN"). For homicides, injury descriptions are available dating back to 1991, so the shooting column will read either "Y" or "N" to indicate whether the homicide was a fatal shooting or not. For non-fatal shootings, data is only available as of 2010. As a result, for any non-fatal shootings that occurred from 2010 to the present, the shooting column will read as "Y." Non-fatal shooting victims will not be included in this dataset prior to 2010; they will be included in the authorized dataset, but with "UNKNOWN" in the shooting column.

The dataset is refreshed daily, but excludes the most recent complete day to allow CPD time to gather the best available information. Each time the dataset is refreshed, records can change as CPD learns more about each victimization, especially those victimizations that are most recent. The data on the Mayor's Office Violence Reduction Dashboard is updated daily with an approximately 48-hour lag. As cases are passed from the initial reporting officer to the investigating detectives, some recorded data about incidents and victimizations may change once additional information arises. Regularly updated datasets on the City's public portal may change to reflect new or corrected information.

How does this dataset classify victims? The methodology by which this dataset classifies victims of violent crime differs by victimization type: Homicide and non-fatal shooting victims: A victimization is considered a homicide victimization or non-fatal shooting victimization depending on its presence in CPD's homicide victims data table or its shooting victims data table. A victimization is considered a homicide only if it is present in CPD's homicide data table, while a victimization is considered a non-fatal shooting only if it is present in CPD's shooting data tables and absent from CPD's homicide data table. To determine the IUCR code of homicide and non-fatal shooting victimizations, we defer to the incident IUCR code available in CPD's Crimes, 2001-present dataset (available on the City's open data portal).
If the IUCR code in CPD's Crimes dataset is inconsistent with the homicide/non-fatal shooting categorization, we defer to CPD's Victims dataset. For a criminal homicide, the only sensible IUCR codes are 0110 (first-degree murder) or 0130 (second-degree murder). For a non-fatal shooting, a sensible IUCR code must signify a criminal sexual assault, a robbery, or, most commonly, an aggravated battery. In rare instances, the IUCR code in CPD's Crimes and Vi
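The small-cell suppression rule described above (dropping any quarter-level cell with fewer than three incidents) can be sketched in a few lines. The column names and toy records below are illustrative assumptions, not the dataset's actual schema.

```python
import pandas as pd

# Toy incident-level records; columns are hypothetical.
victims = pd.DataFrame({
    "quarter": ["2020Q1", "2020Q1", "2020Q1", "2020Q1", "2020Q2"],
    "victimization_type": ["BATTERY"] * 4 + ["HOMICIDE"],
    "domestic": [True, True, True, False, False],
})

# Aggregate to quarter-level cells, then withhold cells with n < 3,
# mirroring the re-identification safeguard described above.
cells = (victims.groupby(["quarter", "victimization_type", "domestic"])
                .size()
                .reset_index(name="n_incidents"))
released = cells[cells["n_incidents"] >= 3]
print(released)
```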