100+ datasets found
  1. Global Web Data | Web Scraping Data | Job Postings Data | Source: Company...

    • datarade.ai
    .json
    Cite
    PredictLeads, Global Web Data | Web Scraping Data | Job Postings Data | Source: Company Website | 214M+ Records [Dataset]. https://datarade.ai/data-products/predictleads-web-data-web-scraping-data-job-postings-dat-predictleads
    Explore at:
    Available download formats: .json
    Dataset authored and provided by
    PredictLeads
    Area covered
    Bosnia and Herzegovina, Virgin Islands (British), Northern Mariana Islands, Comoros, French Guiana, Guadeloupe, Bonaire, El Salvador, Kosovo, Kuwait
    Description

    PredictLeads Job Openings Data provides high-quality hiring insights sourced directly from company websites - not job boards. Using advanced web scraping technology, our dataset offers real-time access to job trends, salaries, and skills demand, making it a valuable resource for B2B sales, recruiting, investment analysis, and competitive intelligence.

    Key Features:

    ✅ 214M+ Job Postings Tracked – Data sourced from 92 million company websites worldwide.
    ✅ 7.1M+ Active Job Openings – Updated in real time to reflect hiring demand.
    ✅ Salary & Compensation Insights – Extract salary ranges, contract types, and job seniority levels.
    ✅ Technology & Skill Tracking – Identify emerging tech trends and industry demands.
    ✅ Company Data Enrichment – Link job postings to employer domains, firmographics, and growth signals.
    ✅ Web Scraping Precision – Directly sourced from employer websites for unmatched accuracy.

    Primary Attributes:

    • id (string, UUID) – Unique identifier for the job posting.
    • type (string, constant: "job_opening") – Object type.
    • title (string) – Job title.
    • description (string) – Full job description, extracted from the job listing.
    • url (string, URL) – Direct link to the job posting.
    • first_seen_at – Timestamp when the job was first detected.
    • last_seen_at – Timestamp when the job was last detected.
    • last_processed_at – Timestamp when the job data was last processed.

    Job Metadata:

    • contract_types (array of strings) – Type of employment (e.g., "full time", "part time", "contract").
    • categories (array of strings) – Job categories (e.g., "engineering", "marketing").
    • seniority (string) – Seniority level of the job (e.g., "manager", "non_manager").
    • status (string) – Job status (e.g., "open", "closed").
    • language (string) – Language of the job posting.
    • location (string) – Full location details as listed in the job description.

    Location Data (location_data) (array of objects)

    • city (string, nullable) – City where the job is located.
    • state (string, nullable) – State or region of the job location.
    • zip_code (string, nullable) – Postal/ZIP code.
    • country (string, nullable) – Country where the job is located.
    • region (string, nullable) – Broader geographical region.
    • continent (string, nullable) – Continent name.
    • fuzzy_match (boolean) – Indicates whether the location was inferred.

    Salary Data (salary_data)

    • salary (string) – Salary range extracted from the job listing.
    • salary_low (float, nullable) – Minimum salary in original currency.
    • salary_high (float, nullable) – Maximum salary in original currency.
    • salary_currency (string, nullable) – Currency of the salary (e.g., "USD", "EUR").
    • salary_low_usd (float, nullable) – Converted minimum salary in USD.
    • salary_high_usd (float, nullable) – Converted maximum salary in USD.
    • salary_time_unit (string, nullable) – Time unit for the salary (e.g., "year", "month", "hour").

    Occupational Data (onet_data) (object, nullable)

    • code (string, nullable) – ONET occupation code.
    • family (string, nullable) – Broad occupational family (e.g., "Computer and Mathematical").
    • occupation_name (string, nullable) – Official ONET occupation title.

    Additional Attributes:

    • tags (array of strings, nullable) – Extracted skills and keywords (e.g., "Python", "JavaScript").
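
A minimal sketch of how a record with the attributes listed above might be read in Python. The sample record and its values are invented for illustration, the helper names are hypothetical, and the sketch assumes `salary_data` is a nullable object and `location_data` a nullable array, which the listing does not state explicitly.

```python
import json
from typing import Any, Optional

# Hypothetical record shaped after the attribute list above; all values are invented.
SAMPLE = json.loads("""
{
  "id": "00000000-0000-0000-0000-000000000000",
  "type": "job_opening",
  "title": "Data Engineer",
  "status": "open",
  "contract_types": ["full time"],
  "location_data": [
    {"city": "Ljubljana", "country": "Slovenia", "fuzzy_match": false}
  ],
  "salary_data": {
    "salary_low_usd": 60000.0,
    "salary_high_usd": 80000.0,
    "salary_time_unit": "year"
  },
  "onet_data": null,
  "tags": ["Python"]
}
""")


def usd_salary_midpoint(record: dict[str, Any]) -> Optional[float]:
    """Return the midpoint of the USD salary range, or None if a bound is missing."""
    salary = record.get("salary_data") or {}  # assumption: salary_data may be null
    low = salary.get("salary_low_usd")
    high = salary.get("salary_high_usd")
    if low is None or high is None:
        return None
    return (low + high) / 2


def countries(record: dict[str, Any]) -> list[str]:
    """Collect the non-null country names from location_data."""
    locations = record.get("location_data") or []
    return [loc["country"] for loc in locations if loc.get("country")]


print(usd_salary_midpoint(SAMPLE), countries(SAMPLE))
```

Because most nested fields are marked nullable, the helpers fall back to empty containers rather than assuming the keys are present.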

    📌 Trusted by enterprises, recruiters, and investors for high-precision job market insights.

    PredictLeads Dataset: https://docs.predictleads.com/v3/guide/job_openings_dataset

  2. Addresses (Open Data)

    • catalog.data.gov
    • data.tempe.gov
    • +10more
    Updated Jul 19, 2025
    Cite
    City of Tempe (2025). Addresses (Open Data) [Dataset]. https://catalog.data.gov/dataset/addresses-open-data
    Explore at:
    Dataset updated
    Jul 19, 2025
    Dataset provided by
    City of Tempe
    Description

    This dataset is a compilation of address point data for the City of Tempe. The dataset contains a point location and the official address (as defined by The Building Safety Division of Community Development) for all occupiable units and any other official addresses in the City. Several additional attributes may be populated for an address, but they are not populated for every address.

    Contact: Lynn Flaaen-Hanna, Development Services Specialist
    Contact E-mail
    Link: Map that Lets You Explore and Export Address Data

    Data Source: The initial dataset was created by combining several datasets and then reviewing the information to remove duplicates and identify errors. This published dataset is the system of record for Tempe addresses going forward, with the address information being created and maintained by The Building Safety Division of Community Development.
    Data Source Type: ESRI ArcGIS Enterprise Geodatabase
    Preparation Method: N/A
    Publish Frequency: Weekly
    Publish Method: Automatic
    Data Dictionary

  3. Seair Exim Solutions

    • seair.co.in
    Updated Mar 31, 2015
    Cite
    Seair Exim (2015). Seair Exim Solutions [Dataset]. https://www.seair.co.in
    Explore at:
    Available download formats: .bin, .xml, .csv, .xls
    Dataset updated
    Mar 31, 2015
    Dataset provided by
    Seair Exim Solutions
    Authors
    Seair Exim
    Area covered
    United States
    Description

    Subscribers can look up export and import data for 23 countries by HS code or product name. This demo is helpful for market analysis.

  4. Alternative Data Market Report

    • archivemarketresearch.com
    doc, pdf, ppt
    Updated Dec 8, 2024
    Cite
    Archive Market Research (2024). Alternative Data Market Report [Dataset]. https://www.archivemarketresearch.com/reports/alternative-data-market-5021
    Explore at:
    Available download formats: doc, ppt, pdf
    Dataset updated
    Dec 8, 2024
    Dataset authored and provided by
    Archive Market Research
    License

    https://www.archivemarketresearch.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    global
    Variables measured
    Market Size
    Description

    The Alternative Data Market was valued at USD 7.20 billion in 2023 and is projected to reach USD 126.50 billion by 2032, exhibiting a CAGR of 50.6% during the forecast period. The alternative data market covers the use and processing of information that is not in financial databases. Such data includes social network posts, satellite images, credit card transactions, web traffic, and many other sources. It is used mostly in finance to make investment decisions, manage risk, and analyze competitors, giving a broader view of market trends and consumer attitudes. Demand for data from unconventional sources is increasing as firms strive to get ahead in highly competitive markets. Current trends include the use of AI and machine learning to process large data sets and the broadening use of alternative data in industries beyond finance.

    Recent developments include:

    • In April 2023, Thinknum Alternative Data launched new data fields in its employee sentiment datasets, which people analytics teams and investors can use as an 'employee NPS' proxy and which support highly rated employers in setting up interviews through employee referrals.
    • In September 2022, Thinknum Alternative Data announced its plan to combine data from Similarweb, SensorTower, Thinknum, Caplight, and Pathmatics with Lagoon, a sophisticated infrastructure platform, to deliver an alternative data source for investment research, due diligence, deal sourcing and origination, and post-acquisition strategies in private markets.
    • In May 2022, M Science LLC launched a consumer spending trends platform providing daily, weekly, monthly, and semi-annual visibility into consumer behaviors and competitive benchmarking. The platform provided real-time insights into consumer spending patterns for Australian brands and an unparalleled business performance analysis.

  5. Global Data Element Market Research Report: By Data Source (Relational...

    • wiseguyreports.com
    Updated Jul 23, 2024
    Cite
    Wiseguy Research Consultants Pvt Ltd (2024). Global Data Element Market Research Report: By Data Source (Relational Databases, NoSQL Databases, Big Data Platforms, Cloud-based Data Warehouses), By Type (Structured Data, Unstructured Data, Semi-Structured Data), By Format (XML, JSON, CSV, Parquet), By Purpose (Data Analysis, Machine Learning, Data Visualization, Data Governance), By Deployment Model (On-premises, Cloud-based, Hybrid) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2032. [Dataset]. https://www.wiseguyreports.com/reports/data-element-market
    Explore at:
    Dataset updated
    Jul 23, 2024
    Dataset authored and provided by
    Wiseguy Research Consultants Pvt Ltd
    License

    https://www.wiseguyreports.com/pages/privacy-policy

    Time period covered
    Jan 7, 2024
    Area covered
    Global
    Description
    BASE YEAR: 2024
    HISTORICAL DATA: 2019 - 2024
    REPORT COVERAGE: Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
    MARKET SIZE 2023: 7.6 (USD Billion)
    MARKET SIZE 2024: 8.66 (USD Billion)
    MARKET SIZE 2032: 24.7 (USD Billion)
    SEGMENTS COVERED: Data Source, Type, Format, Purpose, Deployment Model, Regional
    COUNTRIES COVERED: North America, Europe, APAC, South America, MEA
    KEY MARKET DYNAMICS: AI-driven data element management; Data privacy and regulations; Cloud-based data element platforms; Data sharing and collaboration; Increasing demand for real-time data
    MARKET FORECAST UNITS: USD Billion
    KEY COMPANIES PROFILED: Informatica, Micro Focus, IBM, SAS, Denodo, Oracle, TIBCO, Talend, SAP
    MARKET FORECAST PERIOD: 2024 - 2032
    KEY MARKET OPPORTUNITIES: 1. Adoption of AI and ML; 2. Growing demand for data analytics; 3. Increasing cloud adoption; 4. Data privacy and security concerns; 5. Integration with emerging technologies
    COMPOUND ANNUAL GROWTH RATE (CAGR): 13.99% (2024 - 2032)

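
As a rough cross-check, the growth rate quoted in this entry can be recomputed from the listed market sizes with the standard compound annual growth rate formula, (end / start)^(1 / years) - 1. The sketch below assumes the rate runs from the 2024 market size to the 2032 market size (8 annual periods), which is how the forecast period reads but is not stated outright; the function name `cagr` is illustrative.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over a number of years."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the listing: USD 8.66 billion (2024) to USD 24.7 billion (2032).
rate = cagr(8.66, 24.7, 2032 - 2024)
print(f"Implied CAGR: {rate:.2%}")
```

Under that assumption the result comes out close to the quoted 13.99%.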
  6. Data Source Type

    • opencontext.org
    Updated Sep 29, 2022
    Cite
    David G. Anderson; Joshua Wells; Stephen Yerka; Sarah Whitcher Kansa; Eric C. Kansa (2022). Data Source Type [Dataset]. https://opencontext.org/predicates/6aeff869-47cf-4a32-920c-2ad037458bf9
    Explore at:
    Dataset updated
    Sep 29, 2022
    Dataset provided by
    Open Context
    Authors
    David G. Anderson; Joshua Wells; Stephen Yerka; Sarah Whitcher Kansa; Eric C. Kansa
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    An Open Context "predicates" dataset item. Open Context publishes structured data as granular, URL identified Web resources. This "Variables" record is part of the "Digital Index of North American Archaeology (DINAA)" data publication.

  7. Data from: A Large-scale Dataset of (Open Source) License Text Variants

    • data.niaid.nih.gov
    Updated Mar 31, 2022
    Cite
    Stefano Zacchiroli (2022). A Large-scale Dataset of (Open Source) License Text Variants [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6379163
    Explore at:
    Dataset updated
    Mar 31, 2022
    Dataset authored and provided by
    Stefano Zacchiroli
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    We introduce a large-scale dataset of the complete texts of free/open source software (FOSS) license variants. To assemble it we have collected from the Software Heritage archive—the largest publicly available archive of FOSS source code with accompanying development history—all versions of files whose names are commonly used to convey licensing terms to software users and developers. The dataset consists of 6.5 million unique license files that can be used to conduct empirical studies on open source licensing, training of automated license classifiers, natural language processing (NLP) analyses of legal texts, as well as historical and phylogenetic studies on FOSS licensing. Additional metadata about shipped license files are also provided, making the dataset ready to use in various contexts; they include: file length measures, detected MIME type, detected SPDX license (using ScanCode), example origin (e.g., GitHub repository), oldest public commit in which the license appeared. The dataset is released as open data as an archive file containing all deduplicated license blobs, plus several portable CSV files for metadata, referencing blobs via cryptographic checksums.

    For more details see the included README file and companion paper:

    Stefano Zacchiroli. A Large-scale Dataset of (Open Source) License Text Variants. In proceedings of the 2022 Mining Software Repositories Conference (MSR 2022). 23-24 May 2022 Pittsburgh, Pennsylvania, United States. ACM 2022.

    If you use this dataset for research purposes, please acknowledge its use by citing the above paper.

  8. Success.ai | LinkedIn Company Data – Access 70M Companies & 700M Profiles at...

    • datarade.ai
    Updated Jan 1, 2022
    Cite
    Success.ai (2022). Success.ai | LinkedIn Company Data – Access 70M Companies & 700M Profiles at Unbeatable Prices [Dataset]. https://datarade.ai/data-products/success-ai-linkedin-company-data-access-70m-companies-7-success-ai
    Explore at:
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Jan 1, 2022
    Dataset provided by
    Area covered
    Georgia, Tunisia, Martinique, Ascension and Tristan da Cunha, Suriname, Macedonia (the former Yugoslav Republic of), Singapore, Niger, India, Portugal
    Description

    Maximize your business potential with Success.ai's LinkedIn Company and Contact Data, a comprehensive solution designed to empower your business with strategic insights drawn from one of the largest professional networks in the world. This extensive dataset includes in-depth profiles from over 700 million professionals and 70 million companies globally, making it a goldmine for businesses aiming to enhance their marketing strategies, refine competitive intelligence, and drive robust B2B lead generation.

    Transform Your Email Marketing Efforts
    With Success.ai, tap into highly detailed and direct contact data to personalize your communications effectively. By accessing a vast array of email addresses, personalize your outreach efforts to dramatically improve engagement rates and conversion possibilities.

    Data Enrichment for Comprehensive Insights
    Integrate enriched LinkedIn data seamlessly into your CRM or any analytical system to gain a comprehensive understanding of your market landscape. This enriched view helps you navigate through complex business environments, enhancing decision-making and strategic planning.

    Elevate Your Online Marketing
    Deploy targeted and precision-based online marketing campaigns leveraging detailed professional data from LinkedIn. Tailor your messages and offers based on specific professional demographics, industry segments, and more, to optimize engagement and maximize online marketing ROI.

    Digital Advertising Optimized
    Utilize LinkedIn’s precise company and professional data to create highly targeted digital advertising campaigns. By understanding the profiles of key decision-makers, tailor your advertising strategies to resonate well with your target audience, ensuring high impact and better expenditure returns.

    Accelerate B2B Lead Generation
    Identify and connect directly with key stakeholders and decision-makers to shorten your sales cycles and close deals quicker. With access to high-level contacts in your industry, streamline your lead generation process and enhance the efficiency of your sales funnel.

    Why Partner with Success.ai for LinkedIn Data?

    • Competitive Pricing Assurance: Success.ai guarantees the most aggressive pricing, ensuring you receive unbeatable value for your investment in high-quality professional data.
    • Global Data Access: With coverage extending across 195 countries, tap into a rich reservoir of professional information, covering diverse industries and market segments.
    • High Data Accuracy: Backed by advanced AI technology and manual validation processes, our data accuracy rate stands at 99%, providing you with reliable and actionable insights.
    • Custom Data Integration: Receive tailored data solutions that fit seamlessly into your existing business processes, delivered in formats such as CSV and Parquet for easy integration.
    • Ethical Data Compliance: Our data sourcing and processing practices are fully compliant with global standards, ensuring ethical and responsible use of data.
    • Industry-wide Applications: Whether you’re in technology, finance, healthcare, or any other sector, our data solutions are designed to meet your specific industry needs.

    Strategic Use Cases for Enhanced Business Performance

    • Email Marketing: Leverage accurate contact details for personalized and effective email marketing campaigns.
    • Online Marketing and Digital Advertising: Use detailed demographic and professional data to refine your online presence and digital ad targeting.
    • Data Enrichment and B2B Lead Generation: Enhance your databases and accelerate your lead generation with enriched, up-to-date data.
    • Competitive Intelligence and Market Research: Stay ahead of the curve by using our data for deep market analysis and competitive research.

    With Success.ai, you’re not just accessing data; you’re unlocking a gateway to strategic business growth and enhanced market positioning. Start with Success.ai today to leverage our LinkedIn Company Data and transform your business operations with precision and efficiency.

    Did we mention that we'll beat any price on the market? Try us.

  9. Seair Exim Solutions

    • seair.co.in
    Cite
    Seair Exim, Seair Exim Solutions [Dataset]. https://www.seair.co.in
    Explore at:
    Available download formats: .bin, .xml, .csv, .xls
    Dataset provided by
    Seair Exim Solutions
    Authors
    Seair Exim
    Area covered
    United States
    Description

    Subscribers can look up export and import data for 23 countries by HS code or product name. This demo is helpful for market analysis.

  10. Seair Exim Solutions

    • seair.co.in
    Updated Jul 10, 2018
    Cite
    Seair Exim (2018). Seair Exim Solutions [Dataset]. https://www.seair.co.in
    Explore at:
    Available download formats: .bin, .xml, .csv, .xls
    Dataset updated
    Jul 10, 2018
    Dataset provided by
    Seair Exim Solutions
    Authors
    Seair Exim
    Area covered
    United States
    Description

    Subscribers can look up export and import data for 23 countries by HS code or product name. This demo is helpful for market analysis.

  11. Hourly solar radiation in Langleys and three-digit data-source flag...

    • catalog.data.gov
    • search.dataone.org
    Updated Jul 6, 2024
    Cite
    U.S. Geological Survey (2024). Hourly solar radiation in Langleys and three-digit data-source flag associated with the data, January 1, 1948 - September 30, 2015 [Dataset]. https://catalog.data.gov/dataset/hourly-solar-radiation-in-langleys-and-three-digit-data-source-flag-associated-with-the-30
    Explore at:
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    U.S. Geological Survey
    Description

    The text file "Solar radiation.txt" contains hourly data and the associated data-source flag from January 1, 1948, to September 30, 2015. The primary source of the data is the Argonne National Laboratory, Illinois. The first four columns give the year, month, day, and hour of the observation. Column 5 is the data in Langleys. Column 6 is the three-digit data-source flag, which identifies how the solar radiation data were processed: it indicates whether the data are original or missing, the method that was used to fill the missing periods, and any other transformations of the data. Bera (2014) describes in detail the addition of a new flag based on the regression analysis of the backup data series at St. Charles (STC) for water years (WY) 2008–10. The user of the data should consult Over and others (2010) and Bera (2014) for the detailed documentation of this hourly data-source flag series.

    References Cited:

    Over, T.M., Price, T.H., and Ishii, A.L., 2010, Development and analysis of a meteorological database, Argonne National Laboratory, Illinois: U.S. Geological Survey Open File Report 2010-1220, 67 p., http://pubs.usgs.gov/of/2010/1220/.

    Bera, M., 2014, Watershed Data Management (WDM) database for Salt Creek streamflow simulation, DuPage County, Illinois, water years 2005-11: U.S. Geological Survey Data Series 870, 18 p., http://dx.doi.org/10.3133/ds870.
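
The six-column layout described in this entry could be read with a short parser along the following lines. This is only a sketch: it assumes the columns are whitespace-separated and that the file has no header row, neither of which the description confirms, and the sample rows are invented. The flag is kept as a string so that any leading zeros in the three-digit code survive.

```python
from typing import Iterable, NamedTuple


class SolarObs(NamedTuple):
    year: int
    month: int
    day: int
    hour: int
    langleys: float  # column 5: solar radiation in Langleys
    flag: str        # column 6: three-digit data-source flag, kept as text


def parse_solar_line(line: str) -> SolarObs:
    """Parse one row, assuming six whitespace-separated columns."""
    year, month, day, hour, value, flag = line.split()
    return SolarObs(int(year), int(month), int(day), int(hour), float(value), flag)


def read_solar(lines: Iterable[str]) -> list[SolarObs]:
    """Parse every non-blank line of the file."""
    return [parse_solar_line(line) for line in lines if line.strip()]


# Invented sample rows, only to show the expected shape.
sample = ["1948 1 1 1 0.0 001", "1948 1 1 2 1.5 001", ""]
for obs in read_solar(sample):
    print(obs)
```

With the real file, `read_solar(open("Solar radiation.txt"))` would be the equivalent call, and the delimiter handling may need adjusting once the actual layout is inspected.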

  12. US Company Registry Data for Government Contracts | Verified Data: NAICS,...

    • datarade.ai
    .csv
    Updated Oct 26, 2024
    Cite
    Cosmic Link Data (2024). US Company Registry Data for Government Contracts | Verified Data: NAICS, Equipment, Certifications [Dataset]. https://datarade.ai/data-products/cosmic-link-verified-us-company-registry-data-for-governmen-cosmic-link-data
    Explore at:
    Available download formats: .csv
    Dataset updated
    Oct 26, 2024
    Dataset authored and provided by
    Cosmic Link Data
    Area covered
    United States
    Description

    Gain a Competitive Edge in the US Government Contracting Landscape with Cosmic Link Data

    Empower your business development and government contracting efforts with Cosmic Link Data, the premier source for comprehensive and verified data on US government contractors.

    What Makes Our Data Unique?

    Verified & Up-to-Date: Our data undergoes rigorous verification processes, ensuring accuracy and reliability.

    Real-Time & Historical: Make informed decisions with the latest data feeds, complemented by 60 days of historical data for trend analysis.

    Granular Insights: Dive deep into B2B company profiles, NAICS codes, equipment utilized, industries served, certifications, SAM registration status, and precise geolocation for targeted outreach.

    How is the Data Sourced?

    We cultivate a multi-source approach, combining verified B2B information, industry directories, government databases, and supplier self-reported data. This comprehensive strategy guarantees the most accurate and current picture of the US government contracting landscape.

    Primary Use Cases and Verticals:

    Business Development: Identify and target qualified government contractors for potential partnerships and collaborations.

    Government Contracting: Streamline your bidding process by pinpointing contractors with the specific capabilities and certifications required for government projects.

    Cosmic Link Data as Part of Your Broader Strategy:

    This data product seamlessly integrates with our broader US manufacturing data offerings. This allows you to gain a holistic view of the entire ecosystem, from raw materials and components to finished goods and distribution channels, with a specific focus on government contracting capabilities.

    By leveraging Cosmic Link Data, you can:

    Shorten sales cycles: Identify the right leads faster and increase your win rates on government contracts.

    Optimize resource allocation: Focus your resources on the most promising opportunities.

    Gain a competitive edge: Make data-driven decisions and confidently navigate the dynamic world of US government contracting.

  13. Global Data Scraping Tools Market Research Report: By Deployment Mode...

    • wiseguyreports.com
    Updated Jul 23, 2024
    Cite
    Wiseguy Research Consultants Pvt Ltd (2024). Global Data Scraping Tools Market Research Report: By Deployment Mode (Cloud, Web, On-Premises), By Data Source (Websites, Social Media, E-commerce Platforms, Databases, Flat Files), By Extraction Type (Structured Data, Semi-Structured Data, Unstructured Data), By Cloud Type (SaaS, PaaS, IaaS), By Application (Market Research, Price Monitoring, Lead Generation, Sentiment Analysis, Data Integration) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2032. [Dataset]. https://www.wiseguyreports.com/reports/data-scraping-tools-market
    Explore at:
    Dataset updated
    Jul 23, 2024
    Dataset authored and provided by
    Wiseguy Research Consultants Pvt Ltd
    License

    https://www.wiseguyreports.com/pages/privacy-policy

    Time period covered
    Jan 7, 2024
    Area covered
    Global
    Description
    BASE YEAR: 2024
    HISTORICAL DATA: 2019 - 2024
    REPORT COVERAGE: Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
    MARKET SIZE 2023: 3.24 (USD Billion)
    MARKET SIZE 2024: 3.73 (USD Billion)
    MARKET SIZE 2032: 11.46 (USD Billion)
    SEGMENTS COVERED: Deployment Mode, Data Source, Extraction Type, Cloud Type, Application, Regional
    COUNTRIES COVERED: North America, Europe, APAC, South America, MEA
    KEY MARKET DYNAMICS: 1. AI-powered data extraction; 2. Growing demand for structured data; 3. Cloud-based data scraping services; 4. Real-time web data extraction; 5. Increased use of web scraping for business intelligence
    MARKET FORECAST UNITS: USD Billion
    KEY COMPANIES PROFILED: Dexi.io, Cheerio, ScrapingBee, Import.io, Scrapinghub, 80legs, Bright Data, Mozenda, Phantombuster, Helium Scraper, ScraperAPI, Octoparse, Apify, ParseHub, Diffbot
    MARKET FORECAST PERIOD: 2024 - 2032
    KEY MARKET OPPORTUNITIES: Automation for efficient data collection; Real-time data extraction for enhanced decision-making; Cloud-based tools for scalability and flexibility; AI-powered tools for advanced data analysis; Increased demand for web scraping in various industries
    COMPOUND ANNUAL GROWTH RATE (CAGR): 15.06% (2024 - 2032)

  14. Global Financial Inclusion (Global Findex) Database 2021 - Malta

    • catalog.ihsn.org
    • microdata.worldbank.org
    Updated Dec 16, 2022
    Cite
    Development Research Group, Finance and Private Sector Development Unit (2022). Global Financial Inclusion (Global Findex) Database 2021 - Malta [Dataset]. https://catalog.ihsn.org/catalog/10475
    Explore at:
    Dataset updated
    Dec 16, 2022
    Dataset authored and provided by
    Development Research Group, Finance and Private Sector Development Unit
    Time period covered
    2021
    Area covered
    Malta
    Description

    Abstract

    The fourth edition of the Global Findex offers a lens into how people accessed and used financial services during the COVID-19 pandemic, when mobility restrictions and health policies drove increased demand for digital services of all kinds.

    The Global Findex is the world's most comprehensive database on financial inclusion. It is also the only global demand-side data source allowing for global and regional cross-country analysis to provide a rigorous and multidimensional picture of how adults save, borrow, make payments, and manage financial risks. Global Findex 2021 data were collected from national representative surveys of about 128,000 adults in more than 120 economies. The latest edition follows the 2011, 2014, and 2017 editions, and it includes a number of new series measuring financial health and resilience and contains more granular data on digital payment adoption, including merchant and government payments.

    The Global Findex is an indispensable resource for financial service practitioners, policy makers, researchers, and development professionals.

    Geographic coverage

    National coverage

    Analysis unit

    Individual

    Kind of data

    Observation data/ratings [obs]

    Sampling procedure

    In most developing economies, Global Findex data have traditionally been collected through face-to-face interviews. Surveys are conducted face-to-face in economies where telephone coverage represents less than 80 percent of the population or where in-person surveying is the customary methodology. However, because of ongoing COVID-19 related mobility restrictions, face-to-face interviewing was not possible in some of these economies in 2021. Phone-based surveys were therefore conducted in 67 economies that had been surveyed face-to-face in 2017. These 67 economies were selected for inclusion based on population size, phone penetration rate, COVID-19 infection rates, and the feasibility of executing phone-based methods where Gallup would otherwise conduct face-to-face data collection, while complying with all government-issued guidance throughout the interviewing process. Gallup takes both mobile phone and landline ownership into consideration. According to Gallup World Poll 2019 data, when face-to-face surveys were last carried out in these economies, at least 80 percent of adults in almost all of them reported mobile phone ownership. All samples are probability-based and nationally representative of the resident adult population. Phone surveys were not a viable option in 17 economies that had been part of previous Global Findex surveys, however, because of low mobile phone ownership and surveying restrictions. Data for these economies will be collected in 2022 and released in 2023.

    In economies where face-to-face surveys are conducted, the first stage of sampling is the identification of primary sampling units. These units are stratified by population size, geography, or both, and clustering is achieved through one or more stages of sampling. Where population information is available, sample selection is based on probabilities proportional to population size; otherwise, simple random sampling is used. Random route procedures are used to select sampled households. Unless an outright refusal occurs, interviewers make up to three attempts to survey the sampled household. To increase the probability of contact and completion, attempts are made at different times of the day and, where possible, on different days. If an interview cannot be obtained at the initial sampled household, a simple substitution method is used. Respondents are randomly selected within the selected households. Each eligible household member is listed, and the hand-held survey device randomly selects the household member to be interviewed. For paper surveys, the Kish grid method is used to select the respondent. In economies where cultural restrictions dictate gender matching, respondents are randomly selected from among all eligible adults of the interviewer's gender.
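The first-stage selection rule described above can be sketched in a few lines. This is only an illustration: the function name `select_psus` and its parameters are hypothetical, and the sketch uses with-replacement weighted draws as the simplest variant of probability-proportional-to-size selection, which may differ from the procedure Gallup actually applies.

```python
import random


def select_psus(units, k, populations=None, seed=None):
    """Pick k primary sampling units (hypothetical helper).

    With population counts: draw with probability proportional to
    population size (with replacement, the simplest PPS variant).
    Without them: fall back to simple random sampling without replacement.
    """
    rng = random.Random(seed)
    if populations is not None:
        # Larger units are proportionally more likely to be drawn.
        return rng.choices(units, weights=populations, k=k)
    # No size information: every unit has the same chance.
    return rng.sample(units, k)
```

Calling `select_psus(districts, 10, populations=sizes)` would favor larger districts, while omitting `populations` gives each district an equal chance.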

    In traditionally phone-based economies, respondent selection follows the same procedure as in previous years, using random digit dialing or a nationally representative list of phone numbers. In most economies where mobile phone and landline penetration is high, a dual sampling frame is used.

    The same respondent selection procedure is applied to the new phone-based economies. Dual frame (landline and mobile phone) random digit dialing is used where landline presence and use are 20 percent or higher based on historical Gallup estimates. Mobile phone random digit dialing is used in economies with limited to no landline presence (less than 20 percent).

    For landline respondents in economies where mobile phone or landline penetration is 80 percent or higher, random selection of respondents is achieved by using either the latest birthday or household enumeration method. For mobile phone respondents in these economies or in economies where mobile phone or landline penetration is less than 80 percent, no further selection is performed. At least three attempts are made to reach a person in each household, spread over different days and times of day.
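The thresholds in the phone-survey paragraphs above amount to two small decision rules. The sketch below encodes them as stated in the text; the function names and the returned labels are hypothetical and chosen only for illustration.

```python
def choose_phone_frame(landline_share_pct: float) -> str:
    """Sampling frame for a new phone-based economy.

    Dual frame is used when landline presence and use are 20 percent
    or higher; otherwise mobile-only random digit dialing.
    """
    return "dual_frame_rdd" if landline_share_pct >= 20 else "mobile_rdd"


def within_household_selection(phone_type: str, penetration_pct: float) -> str:
    """Respondent selection step after a number is reached.

    Only landline respondents in economies with 80 percent or higher
    phone penetration get a further random selection step.
    """
    if phone_type == "landline" and penetration_pct >= 80:
        return "latest_birthday_or_household_enumeration"
    return "no_further_selection"
```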

    Sample size for Malta is 1000.

    Mode of data collection

    Landline and mobile telephone

    Research instrument

    Questionnaires are available on the website.

    Sampling error estimates

    Estimates of standard errors (which account for sampling error) vary by country and indicator. For country-specific margins of error, please refer to the Methodology section and corresponding table in Demirgüç-Kunt, Asli, Leora Klapper, Dorothe Singer, Saniya Ansar. 2022. The Global Findex Database 2021: Financial Inclusion, Digital Payments, and Resilience in the Age of COVID-19. Washington, DC: World Bank.

  15. eVA Procurement Data 2024

    • data.virginia.gov
    • opendata.winchesterva.gov
    csv
    Updated Dec 5, 2024
    Cite
    Department of General Services (2024). eVA Procurement Data 2024 [Dataset]. https://data.virginia.gov/dataset/eva-procurement-data-2024
    Explore at:
    Available download formats: csv(1161429190), csv(11704)
    Dataset updated
    Dec 5, 2024
    Dataset authored and provided by
    Department of General Services
    License

    Open Data Commons Attribution License (ODC-By) v1.0 (https://www.opendatacommons.org/licenses/by/1.0/)
    License information was derived automatically

    Description

    eVA is used by more than 245 state agencies and institutes of higher education, and 900+ local governments and public bodies, to announce bidding opportunities, receive quotes, place and approve orders, manage contracts, and more. Since its inception in 2001, eVA has transformed the way the Commonwealth buys goods and services from a decentralized, paper-based process to a centralized, electronic platform. Providing industry-leading procurement solutions for all public bodies, the marketplace includes nearly 100,000 businesses competing to provide the Commonwealth with quality goods and services, resulting in more than $25 million in savings annually. This dataset contains the purchase orders for the goods and services.

  16. Seair Exim Solutions

    • seair.co.in
    Updated Jan 16, 2015
    Cite
    Seair Exim (2015). Seair Exim Solutions [Dataset]. https://www.seair.co.in
    Explore at:
    Available download formats: .bin, .xml, .csv, .xls
    Dataset updated
    Jan 16, 2015
    Dataset provided by
    Seair Exim Solutions
    Authors
    Seair Exim
    Area covered
    United States
    Description

    Subscribers can look up export and import data for 23 countries by HS code or product name. This demo is helpful for market analysis.

  17. AFRL Radiosonde and Thermosonde Data (Source Format)

    • data.ucar.edu
    ascii
    Updated Dec 26, 2024
    Cite
    Donald J. Mattes; Edmund A. Murphy; George Y. Jumper; John R. Roadcap; John W. Myers; Paul Tracy (2024). AFRL Radiosonde and Thermosonde Data (Source Format) [Dataset]. http://doi.org/10.26023/6HCG-E9WC-H0E
    Explore at:
    Available download formats: ascii
    Dataset updated
    Dec 26, 2024
    Dataset provided by
    University Corporation for Atmospheric Research
    Authors
    Donald J. Mattes; Edmund A. Murphy; George Y. Jumper; John R. Roadcap; John W. Myers; Paul Tracy
    Time period covered
    Mar 21, 2006 - Apr 6, 2006
    Area covered
    Description

    The Air Force Research Laboratory was a participant in the T-REX campaign from 20 March 2006 through 6 April 2006, which included IOPs 6 through 9. 16 thermosondes with their radiosondes and 10 additional radiosondes are included in this data set. The sondes were released from the Ash Mountain Helibase near the Ash Mountain Entrance to the Sequoia National Park.

  18. Open Source Data Labeling Tool Report

    • marketresearchforecast.com
    doc, pdf, ppt
    Updated Mar 7, 2025
    Cite
    Market Research Forecast (2025). Open Source Data Labeling Tool Report [Dataset]. https://www.marketresearchforecast.com/reports/open-source-data-labeling-tool-28519
    Explore at:
    Available download formats: ppt, doc, pdf
    Dataset updated
    Mar 7, 2025
    Dataset authored and provided by
    Market Research Forecast
    License

    https://www.marketresearchforecast.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The open-source data labeling tool market is experiencing robust growth, driven by the increasing demand for high-quality training data in the burgeoning artificial intelligence (AI) and machine learning (ML) sectors. The market's expansion is fueled by several key factors. Firstly, the rising adoption of AI across various industries, including healthcare, automotive, and finance, necessitates large volumes of accurately labeled data. Secondly, open-source tools offer a cost-effective alternative to proprietary solutions, making them attractive to startups and smaller companies with limited budgets. Thirdly, the collaborative nature of open-source development fosters continuous improvement and innovation, leading to more sophisticated and user-friendly tools. While the cloud-based segment currently dominates due to scalability and accessibility, on-premise solutions maintain a significant share, especially among organizations with stringent data security and privacy requirements. The geographical distribution reveals strong growth in North America and Europe, driven by established tech ecosystems and early adoption of AI technologies. However, the Asia-Pacific region is expected to witness significant growth in the coming years, fueled by increasing digitalization and government initiatives promoting AI development. The market faces some challenges, including the need for skilled data labelers and the potential for inconsistencies in data quality across different open-source tools. Nevertheless, ongoing developments in automation and standardization are expected to mitigate these concerns. The forecast period of 2025-2033 suggests a continued upward trajectory for the open-source data labeling tool market. 
Assuming a conservative CAGR of 15% (a reasonable estimate given the rapid advancements in AI and the increasing need for labeled data), and a 2025 market size of $500 million (a plausible figure considering the significant investments in the broader AI market), the market is projected to reach approximately $1.8 billion by 2033. This growth will be further shaped by the ongoing development of new features, improved user interfaces, and the integration of advanced techniques such as active learning and semi-supervised learning within open-source tools. The competitive landscape is dynamic, with both established players and emerging startups contributing to the innovation and expansion of this crucial segment of the AI ecosystem. Companies are focusing on improving the accuracy, efficiency, and accessibility of their tools to cater to a growing and diverse user base.
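The projection above can be checked with the standard compound-growth formula, value × (1 + rate)^years. The sketch below uses the report's own assumed inputs ($500 million in 2025, 15% CAGR); treating 2025 to 2033 as eight annual compounding steps is an assumption of this example, and the helper name `project_market_size` is hypothetical.

```python
def project_market_size(base_value: float, cagr: float, years: int) -> float:
    """Compound a starting value at a constant annual growth rate."""
    return base_value * (1 + cagr) ** years


# Inputs taken from the report's stated assumptions (in millions of USD).
eight_steps = project_market_size(500, 0.15, 8)  # 2025 -> 2033
nine_steps = project_market_size(500, 0.15, 9)   # one extra compounding period
print(f"8 periods: ${eight_steps:,.0f}M, 9 periods: ${nine_steps:,.0f}M")
```

Under these inputs, eight compounding periods give roughly $1.53 billion, while nine give roughly $1.76 billion. The quoted figure of approximately $1.8 billion therefore matches nine periods more closely than eight, so the report's base year may differ from the one assumed here.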

  19. Enterprise-Driven Open Source Software

    • data.niaid.nih.gov
    • opendatalab.com
    • +1more
    Updated Apr 22, 2020
    Cite
    Kotti, Zoe (2020). Enterprise-Driven Open Source Software [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3653877
    Explore at:
    Dataset updated
    Apr 22, 2020
    Dataset provided by
    Kravvaritis, Konstantinos
    Theodorou, Georgios
    Louridas, Panos
    Spinellis, Diomidis
    Kotti, Zoe
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    We present a dataset of open source software developed mainly by enterprises rather than volunteers. This can be used to address known generalizability concerns, and, also, to perform research on open source business software development. Based on the premise that an enterprise's employees are likely to contribute to a project developed by their organization using the email account provided by it, we mine domain names associated with enterprises from open data sources as well as through white- and blacklisting, and use them through three heuristics to identify 17,264 enterprise GitHub projects. We provide these as a dataset detailing their provenance and properties. A manual evaluation of a dataset sample shows an identification accuracy of 89%. Through an exploratory data analysis we found that projects are staffed by a plurality of enterprise insiders, who appear to be pulling more than their weight, and that in a small percentage of relatively large projects development happens exclusively through enterprise insiders.

    The main dataset is provided as a 17,264 record tab-separated file named enterprise_projects.txt with the following 29 fields.

    url: the project's GitHub URL

    project_id: the project's GHTorrent identifier

    sdtc: true if selected using the same domain top committers heuristic (9,016 records)

    mcpc: true if selected using the multiple committers from a valid enterprise heuristic (8,314 records)

    mcve: true if selected using the multiple committers from a probable company heuristic (8,015 records)

    star_number: number of GitHub watchers

    commit_count: number of commits

    files: number of files in current main branch

    lines: corresponding number of lines in text files

    pull_requests: number of pull requests

    github_repo_creation: timestamp of the GitHub repository creation

    earliest_commit: timestamp of the earliest commit

    most_recent_commit: date of the most recent commit

    committer_count: number of different committers

    author_count: number of different authors

    dominant_domain: the project's dominant email domain

    dominant_domain_committer_commits: number of commits made by committers whose email matches the project's dominant domain

    dominant_domain_author_commits: corresponding number for commit authors

    dominant_domain_committers: number of committers whose email matches the project's dominant domain

    dominant_domain_authors: corresponding number for commit authors

    cik: SEC's EDGAR "central index key"

    fg500: true if this is a Fortune Global 500 company (2,233 records)

    sec10k: true if the company files SEC 10-K forms (4,180 records)

    sec20f: true if the company files SEC 20-F forms (429 records)

    project_name: GitHub project name

    owner_login: GitHub project's owner login

    company_name: company name as derived from the SEC and Fortune 500 data

    owner_company: GitHub project's owner company name

    license: SPDX license identifier
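A tab-separated file with the fields listed above could be read along the following lines. This is a minimal sketch under two assumptions that the description does not confirm: that the file starts with a header row naming the fields, and that the boolean columns hold the text "true" or "false". The helper `read_projects` and the sample rows are hypothetical.

```python
import csv
import io

# Boolean flag columns named in the field list above.
BOOL_FIELDS = ("sdtc", "mcpc", "mcve", "fg500", "sec10k", "sec20f")


def read_projects(stream):
    """Yield one dict per record, converting flag columns to bool."""
    reader = csv.DictReader(stream, delimiter="\t")  # assumes a header row
    for row in reader:
        for name in BOOL_FIELDS:
            if name in row:
                # Assumes flags are stored as the text "true"/"false".
                row[name] = row[name].strip().lower() == "true"
        yield row


# Hypothetical two-record sample with only a few of the 29 fields.
SAMPLE = (
    "url\tsdtc\tcommit_count\n"
    "https://github.com/example/a\ttrue\t120\n"
    "https://github.com/example/b\tfalse\t45\n"
)
rows = list(read_projects(io.StringIO(SAMPLE)))
sdtc_projects = [r["url"] for r in rows if r["sdtc"]]
```

For the real file, `open("enterprise_projects.txt", newline="", encoding="utf-8")` would replace the in-memory sample stream.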

    The file cohost_project_details.txt provides the full set of 311,223 cohort projects that are not part of the enterprise data set, but have comparable quality attributes.

    url: the project's GitHub URL

    project_id: the project's GHTorrent identifier

    stars: number of GitHub watchers

    commit_count: number of commits

  20. Circa 1956 Land Area in Coastal Louisiana - Original Data Source - National...

    • catalog.data.gov
    • data.usgs.gov
    • +1more
    Updated Jul 6, 2024
    Cite
    U.S. Geological Survey (2024). Circa 1956 Land Area in Coastal Louisiana - Original Data Source - National Wetlands Inventory - Revisions to Georectification [Dataset]. https://catalog.data.gov/dataset/circa-1956-land-area-in-coastal-louisiana-original-data-source-national-wetlands-inventory
    Explore at:
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    Louisiana
    Description

    The dataset presented here represents a circa 1956 land/water delineation of coastal Louisiana used in part of a larger study to quantify landscape changes from 1932 to 2016. The original dataset was created by the U.S. Fish and Wildlife Service, Office of Biological Services. The USGS Wetland and Aquatic Research Center altered the original data by improving the geo-rectification in specific areas known to contain geo-rectification error, most notably in coastal wetland areas in the vicinity of Four League Bay in western Terrebonne Basin. The dataset contains two categories, land and water. For the purposes of this effort, areas characterized by emergent vegetation, upland, wetland forest, or scrub-shrub were classified as land, while open water, aquatic beds, and mudflats were classified as water. For additional information regarding this dataset (other than geo-rectification revisions), please contact the dataset originator, the U.S. Fish and Wildlife Service (USFWS).

Cite
PredictLeads, Global Web Data | Web Scraping Data | Job Postings Data | Source: Company Website | 214M+ Records [Dataset]. https://datarade.ai/data-products/predictleads-web-data-web-scraping-data-job-postings-dat-predictleads

Global Web Data | Web Scraping Data | Job Postings Data | Source: Company Website | 214M+ Records

Explore at:
Available download formats: .json
Dataset authored and provided by
PredictLeads
Area covered
Bosnia and Herzegovina, Virgin Islands (British), Northern Mariana Islands, Comoros, French Guiana, Guadeloupe, Bonaire, El Salvador, Kosovo, Kuwait
Description

PredictLeads Job Openings Data provides high-quality hiring insights sourced directly from company websites - not job boards. Using advanced web scraping technology, our dataset offers real-time access to job trends, salaries, and skills demand, making it a valuable resource for B2B sales, recruiting, investment analysis, and competitive intelligence.

Key Features:

✅ 214M+ Job Postings Tracked – Data sourced from 92 Million company websites worldwide.
✅ 7,1M+ Active Job Openings – Updated in real-time to reflect hiring demand.
✅ Salary & Compensation Insights – Extract salary ranges, contract types, and job seniority levels.
✅ Technology & Skill Tracking – Identify emerging tech trends and industry demands.
✅ Company Data Enrichment – Link job postings to employer domains, firmographics, and growth signals.
✅ Web Scraping Precision – Directly sourced from employer websites for unmatched accuracy.

Primary Attributes:

  • id (string, UUID) – Unique identifier for the job posting.
  • type (string, constant: "job_opening") – Object type.
  • title (string) – Job title.
  • description (string) – Full job description, extracted from the job listing.
  • url (string, URL) – Direct link to the job posting.
  • first_seen_at – Timestamp when the job was first detected.
  • last_seen_at – Timestamp when the job was last detected.
  • last_processed_at – Timestamp when the job data was last processed.

Job Metadata:

  • contract_types (array of strings) – Type of employment (e.g., "full time", "part time", "contract").
  • categories (array of strings) – Job categories (e.g., "engineering", "marketing").
  • seniority (string) – Seniority level of the job (e.g., "manager", "non_manager").
  • status (string) – Job status (e.g., "open", "closed").
  • language (string) – Language of the job posting.
  • location (string) – Full location details as listed in the job description.

Location Data (location_data) (array of objects)

  • city (string, nullable) – City where the job is located.
  • state (string, nullable) – State or region of the job location.
  • zip_code (string, nullable) – Postal/ZIP code.
  • country (string, nullable) – Country where the job is located.
  • region (string, nullable) – Broader geographical region.
  • continent (string, nullable) – Continent name.
  • fuzzy_match (boolean) – Indicates whether the location was inferred.

Salary Data (salary_data)

  • salary (string) – Salary range extracted from the job listing.
  • salary_low (float, nullable) – Minimum salary in original currency.
  • salary_high (float, nullable) – Maximum salary in original currency.
  • salary_currency (string, nullable) – Currency of the salary (e.g., "USD", "EUR").
  • salary_low_usd (float, nullable) – Converted minimum salary in USD.
  • salary_high_usd (float, nullable) – Converted maximum salary in USD.
  • salary_time_unit (string, nullable) – Time unit for the salary (e.g., "year", "month", "hour").

Occupational Data (onet_data) (object, nullable)

  • code (string, nullable) – ONET occupation code.
  • family (string, nullable) – Broad occupational family (e.g., "Computer and Mathematical").
  • occupation_name (string, nullable) – Official ONET occupation title.

Additional Attributes:

  • tags (array of strings, nullable) – Extracted skills and keywords (e.g., "Python", "JavaScript").
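To make the attribute list above concrete, here is a minimal sketch that reads one job record and derives a salary midpoint in USD. The sample record is invented for illustration, and the nesting of the salary fields under a `salary_data` object is an assumption based on the section headings above, not a confirmed payload layout. The helper `salary_midpoint_usd` is hypothetical.

```python
import json
from typing import Optional


def salary_midpoint_usd(job: dict) -> Optional[float]:
    """Midpoint of the USD salary range, or None if either bound is missing."""
    salary = job.get("salary_data") or {}  # the object may be absent or null
    low = salary.get("salary_low_usd")
    high = salary.get("salary_high_usd")
    if low is None or high is None:
        return None
    return (low + high) / 2


# Hypothetical record using field names from the attribute list.
SAMPLE_JOB = json.loads("""
{
  "id": "00000000-0000-0000-0000-000000000000",
  "type": "job_opening",
  "title": "Data Engineer",
  "status": "open",
  "contract_types": ["full time"],
  "salary_data": {
    "salary_low_usd": 90000.0,
    "salary_high_usd": 100000.0,
    "salary_currency": "USD",
    "salary_time_unit": "year"
  },
  "tags": ["Python"]
}
""")
midpoint = salary_midpoint_usd(SAMPLE_JOB)
```

Because most salary fields are listed as nullable, the helper returns `None` instead of raising when a bound is missing.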

📌 Trusted by enterprises, recruiters, and investors for high-precision job market insights.

PredictLeads Dataset: https://docs.predictleads.com/v3/guide/job_openings_dataset
