Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains the proportion of traffic to each public Wikimedia project, from each known country, with some caveats.
This dataset represents an aggregate of 1:1000 sampled pageviews from the entirety of 2014. The pageviews definition applied was the Foundation's new pageviews definition; additionally, spiders and similar automata were filtered out with Tobie's ua-parser. Geolocation was then performed using MaxMind's geolocation products. There are no privacy implications that we could identify: the data comes from 1:1000 sampled logs, is proportionate rather than raw, and aggregates any nations with
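As a rough illustration of what "proportionate rather than raw" can mean, the sketch below turns sampled (project, country) pageview pairs into shares. It assumes the proportions are computed per project, which the description does not state explicitly; the function name and the sample records are hypothetical, and the ua-parser filtering and MaxMind geolocation steps are not reproduced.

```python
from collections import Counter


def country_proportions(sampled_views):
    """Return {(project, country): share of that project's sampled views}.

    `sampled_views` is an iterable of (project, country) pairs, standing in
    for rows of an already filtered and geolocated sampled log (assumption).
    """
    counts = Counter(sampled_views)

    # Total sampled views per project, used as the denominator.
    totals = Counter()
    for (project, _country), n in counts.items():
        totals[project] += n

    return {
        (project, country): n / totals[project]
        for (project, country), n in counts.items()
    }
```

Because every row is sampled at the same 1:1000 rate, the sampling factor cancels out of each share, so only the proportions (not the raw counts) are exposed.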
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset offers insights into job postings, primarily focusing on roles in Data Engineering, Data Analysis, Data Science, and Machine Learning Engineering. It contains approximately 1583 records of job information, providing a snapshot of the employment landscape in these fields. The dataset is ideal for understanding market demands and trends.
The dataset is provided as a single CSV file, named 'job_dataset.csv'. It comprises 1583 rows and 8 columns, representing the structure of the collected job information. The data collection occurred around 26th July 2022.
This dataset is well-suited for various analytical tasks:
* Cleaning and refining job data.
* Identifying the most in-demand skills within the data and machine learning sectors.
* Analysing the geographical distribution of jobs.
* Conducting Natural Language Processing (NLP) and research on job descriptions.
* Market analysis for job seekers, recruiters, and educational institutions.
The dataset has a global scope, with notable concentrations of job postings in locations such as Bengaluru, Karnataka (30%) and Gurgaon, Haryana (7%). The records primarily cover job postings for data-related roles, including Data Engineer, Data Analyst, Data Scientist, and ML Engineer, with data collected around July 2022. Some postings were listed over 30 days prior to the collection date.
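A minimal sketch of how location shares like those quoted above could be computed from the CSV file. The column name `location` is an assumption, since the description does not list the eight column names; adjust it to match the actual header in 'job_dataset.csv'.

```python
import csv
from collections import Counter


def location_shares(csv_file, column="location"):
    """Return {location: share of postings}, most common first.

    `csv_file` is any open text file (or file-like object) with a header row.
    `column` is a hypothetical column name, not confirmed by the source.
    """
    reader = csv.DictReader(csv_file)
    # Skip rows where the location cell is missing or empty.
    counts = Counter(row[column].strip() for row in reader if row.get(column))
    total = sum(counts.values())
    return {loc: n / total for loc, n in counts.most_common()}
```

Usage would look like `with open("job_dataset.csv", newline="", encoding="utf-8") as f: shares = location_shares(f)`. Locations that contain commas, such as "Bengaluru, Karnataka", are handled correctly as long as they are quoted in the file.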
CC0
This dataset is valuable for:
* Data Scientists and Analysts: For market research, trend analysis, and skill demand assessment.
* Machine Learning Engineers: To understand job requirements and role distributions.
* Researchers: For academic studies on labour markets and skill development.
* Job Seekers: To identify popular roles, required skills, and geographical opportunities.
* Companies and Recruiters: For talent acquisition strategies and competitor analysis.
Original Data Source: Indeed job (Data science / data analyst / ML)
There are several ArcInfo coverages described by this metadata record - FRAME, GEOL, MAPGRID, SITES, STRLINE and STRUC (in that order). Each coverage is described below. The data is also provided as shapefiles and ArcInfo interchange files. The data was used for the Mawson Escarpment Geology map published in 1998. This map is available from a URL provided in this metadata record.
FRAME:
The coverage FRAME contains arcs and polygons (with labels) and forms the limits of the data sets or map coverage of the MAWSON ESCARPMENT area of the AUSTRALIAN ANTARCTIC TERRITORY.
The purpose of this dataset is to serve as a cookie cutter for future data that may be acquired and require clipping to the map/data area.
GEOL:
The coverage GEOL is historical geological data covering the MAWSON ESCARPMENT area.
The data were captured in ARC/INFO format and combined with geological outcrops that were accurately digitised over a March 1989 Landsat Thematic Mapper image at a scale of 1:100000. It is not recommended that this data be used beyond this scale.
The coverage contains arcs (lines) and polygons (polygon labels). These objects are attributed as fully as possible in the .aat file for arcs and the .pat file for polygon labels, and conform with the Geoscience Australia Geoscience Data Dictionary Version 98.04.
The purpose of the dataset is for it to become part of a greater geological database of the Australian Antarctic Territory.
(1998-04-10 - 1998-06-30)
MAPGRID:
MAPGRID is a graticule that was generated as a 5 minute by 5 minute grid mainly to allow for good location/registration of source materials for digitising and adding some locational anno.mapgrat
This coverage's other function was to be used for a proof plot.
(1998-04-22 - 1998-06-30)
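For readers unfamiliar with graticules, the following sketch shows one way the line positions of a 5 minute by 5 minute grid could be computed. It is not the procedure used to build MAPGRID (which was generated in ARC/INFO); the function name and the extent values in the usage note are purely illustrative.

```python
import math


def graticule_ticks(min_deg, max_deg, step_minutes=5):
    """Return grid-line positions (decimal degrees) between two bounds.

    Positions fall on whole multiples of `step_minutes` arc-minutes.
    Working in integer arc-minutes avoids floating-point drift that
    repeated addition of 5/60 degrees would introduce.
    """
    # Convert bounds to arc-minutes and snap inward to the step.
    start = math.ceil(round(min_deg * 60, 6) / step_minutes) * step_minutes
    end = math.floor(round(max_deg * 60, 6) / step_minutes) * step_minutes
    return [m / 60 for m in range(int(start), int(end) + 1, step_minutes)]
```

Calling the function once for longitude and once for latitude, for example `graticule_ticks(68.0, 68.25)` and `graticule_ticks(-73.5, -73.25)`, gives the meridians and parallels of the grid; each pair of values is a grid intersection that can serve as a registration point.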
SITES:
The purpose of this dataset is to provide the approximate location of historic sample sites in the MAWSON ESCARPMENT region of the AUSTRALIAN ANTARCTIC TERRITORY, for future expansion or more accurate positioning when improved records of location are found.
(1998-05-11 - 1998-06-30)
STRLINE:
This coverage of structural lines for geology is named STRLINE.
The purpose of the dataset is to hold the linear structural features in their own coverage, containing only structures that do not form polygon boundaries.
(1998-05-28 - 1998-06-30)
STRUC:
This coverage, called STRUC for structural measurements, is a point coverage. It describes mesoscopic structures at a site or outcrop.
The purpose of the dataset is to provide all the known structural point data in one coverage.
(1998-05-28 - 1998-06-30)
The California Rivers Assessment (CARA) is a computer-based data management system designed to give resource managers, policy-makers, landowners, scientists and interested citizens rapid access to essential information and tools with which to make sound decisions about the conservation and use of California's rivers.
The California Rivers Assessment has the following goals: to provide a
computerized forum for collecting, storing, analyzing, exchanging and
retrieving river-related resource data; to improve coordination between local,
state and federal agencies, other organizations and the interested public; to
develop a perspective on the demands and uses of California's river resources;
and to establish a process for evaluating and assessing river resources on an
ongoing basis.
Although a substantial amount of information about California's rivers is now
stored in computers, the locations and formats for this information vary, often
making it difficult to access and use. The second phase of the California
Rivers Assessment is the design of a data management system called an Aggregated
Information Model (AIM) that makes a wide range of river-related information
available at a single location in a consistent format. As in Phase I, the Reach
File system and Hydrologic Unit Codes provide a common, statewide geographic
reference framework for integrating data from different sources.
The development of the AIM began with the acquisition and integration of
computer-based river resource information on 13 of California's 149 river
basins. These "demonstration basins" were chosen to reflect California's wide
range of biological diversity. The Aggregated Information Model now
incorporates 60 or more data sets for each of 120 river basins. These layers
include vegetation, land ownership, dams, water quality parameters, rare and
endangered species, native fish, National Wetlands Inventory designations,
soils and farmlands inventories. By June 1998, all of California's 149 basins
will have a uniform set of aggregated data, as well as other specific local
data sets.
AIM allows users to produce custom maps from GIS layers by providing a query
system over the World Wide Web. "ICE MAPS" (Interactive California
Environmental Mapping, Assessment and Planning System) enables users to create
and download their own maps by defining a region within the state and selecting
desired data sets. Map products include a title bar, scale bar, legend, links
to related Internet sites and tabular data where available. A new version of
"ICE MAPS" is also available that allows users to query the AIM data directly.
Using a variety of inputs, IFPRI's Spatial Production Allocation Model (SPAM, also known as MapSPAM) uses a cross-entropy approach to make plausible estimates of crop distribution within disaggregated units. Moving the data from coarser units, such as countries and sub-national provinces, to finer units, such as grid cells, reveals spatial patterns of crop performance, creating an Africa South of the Sahara-wide grid-scape at the confluence between geography and agricultural production systems. Improving spatial understanding of crop production systems allows policymakers and donors to better target agricultural and rural development policies and investments, increasing food security and growth with minimal environmental impacts.
This dataset is a spatially enriched synthetic individual-level population of people aged between 15 and 49 years in Kogi State, Nigeria. It was developed through Spatial Microsimulation (SMS), which combines Multiple Indicator Cluster Survey (MICS 5) microdata with an analytical small-area zoning system that is an optimized surrogate of the ward-level geography of Kogi State, Nigeria. Whereas the actual MICS 5 microdata of Kogi State is a population sample of about 1,305 people comprising 912 females and 393 males, this synthetic population contains 2,249,170 microunits comprising 1,115,283 females and 1,133,887 males, with about 425 MICS 5 attributes. The small-area zone to which each microunit (person) belongs is indicated with primary keys named 'ZoneID' or 'GRID_ID_15k'. The analytical zoning system for this synthetic population is included in this dataset as a separate shapefile. This synthetic population data is useful for Small-Area Estimation and mapping of relevant attributes. It enables robust spatial analysis of MICS 5 microdata and indicators at very fine spatial scales, as well as at the individual level, in Kogi State, Nigeria. Owing to its enhanced spatial fidelity, this dataset is invaluable for supporting precise and equitable geographical targeting of Sustainable Development Goal (SDG) initiatives in developing countries like Nigeria.
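To illustrate the kind of small-area estimate this structure supports, the sketch below aggregates individual records to zone level using the 'ZoneID' key. The indicator name `literate` and the sample records are hypothetical; the actual attribute names come from MICS 5 and are not listed here.

```python
from collections import defaultdict


def zone_proportions(records, indicator, zone_key="ZoneID"):
    """Return {zone: share of persons in that zone with a truthy indicator}.

    `records` is an iterable of dict-like rows, one per synthetic person.
    `indicator` is the name of a binary attribute (hypothetical here).
    """
    totals = defaultdict(int)  # persons per zone
    hits = defaultdict(int)    # persons per zone with the indicator set

    for person in records:
        zone = person[zone_key]
        totals[zone] += 1
        if person[indicator]:
            hits[zone] += 1

    return {zone: hits[zone] / totals[zone] for zone in totals}
```

The resulting dictionary can be joined to the zoning shapefile on the same key to map the indicator. With over two million records, streaming the rows through a generator rather than loading them all into memory keeps this approach workable.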