3 datasets found
  1. DATS 6401 - Final Project - Yon ho Cheong.zip

    • figshare.com
    zip
    Updated Dec 15, 2018
    Cite
    Yon ho Cheong (2018). DATS 6401 - Final Project - Yon ho Cheong.zip [Dataset]. http://doi.org/10.6084/m9.figshare.7471007.v1
    Available download formats: zip
    Dataset updated
    Dec 15, 2018
    Dataset provided by
    figshare
    Authors
    Yon ho Cheong
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract

    The H1B is an employment-based visa category for temporary foreign workers in the United States. A U.S. employer must submit a petition for an H1B visa to the US immigration department; every year, the department receives over 200,000 petitions and selects 85,000 applications through a random process. This is the most common visa status applied for by international students once they complete college or higher education and begin working in a full-time position. The project provides essential information on job titles, preferred regions of settlement, and trends among foreign applicants and employers in H1B visa applications. Because locations, employers, job titles and salary ranges make up most of the H1B petitions, different visualization tools are used to analyze and interpret them in relation to H1B visa trends and to provide a recommendation to applicants. This report is the basis of the project for the Visualization of Complex Data class at the George Washington University. Some examples in this project analyze the relevant variables (Case Status, Employer Name, SOC Name, Job Title, Prevailing Wage, Worksite, and Latitude and Longitude information) from Kaggle and the Office of Foreign Labor Certification (OFLC) in order to see the H1B visa changes in the past several decades.

    Keywords: H1B visa, Data Analysis, Visualization of Complex Data, HTML, JavaScript, CSS, Tableau, D3.js

    Dataset

    The dataset contains 10 columns and covers a total of 3 million records spanning 2011-2016. The relevant columns in the dataset include case status, employer name, SOC name, job title, full-time position, prevailing wage, year, worksite, and latitude and longitude information.

    • Link to dataset: https://www.kaggle.com/nsharan/h-1b-visa
    • Link to dataset (FY2017): https://www.foreignlaborcert.doleta.gov/performancedata.cfm

    Running the code

    Open Index.html

    Data Processing

    • Preprocess the raw data to transform it into an understandable format.
    • Find and combine any other external datasets that enrich the analysis, such as the FY2017 dataset.
    • Develop variables and compile them into the visualization programs to make appropriate visualizations.
    • Draw a geo map and scatter plot to compare the fastest growth in fixed value and in percentages.
    • Extract some aspects and analyze the changes in employers' preferences as well as forecasts of future trends.

    Visualizations

    • Combo chart: shows the overall volume of receipts and the approval rate.
    • Scatter plot: shows the beneficiary country of birth.
    • Geo map: shows H1B petitions filed across all states.
    • Line chart: shows the top 10 states by H1B petitions filed.
    • Pie chart: shows a comparison of education level and occupations for petitions, FY2011 vs FY2017.
    • Tree map: shows the top employers overall that submit the greatest number of applications.
    • Side-by-side bar chart: shows an overall comparison of Data Scientist and Data Analyst.
    • Highlight table: shows the mean wage of a Data Scientist and a Data Analyst with case status certified.
    • Bubble chart: shows the top 10 companies for Data Scientist and Data Analyst.

    Related Research

    • The H-1B Visa Debate, Explained - Harvard Business Review: https://hbr.org/2017/05/the-h-1b-visa-debate-explained
    • Foreign Labor Certification Data Center: https://www.foreignlaborcert.doleta.gov
    • Key facts about the U.S. H-1B visa program: http://www.pewresearch.org/fact-tank/2017/04/27/key-facts-about-the-u-s-h-1b-visa-program/
    • H1B visa News and Updates from The Economic Times: https://economictimes.indiatimes.com/topic/H1B-visa/news
    • H-1B visa - Wikipedia: https://en.wikipedia.org/wiki/H-1B_visa

    Key Findings

    • According to the analysis, the government cut down the number of H1B approvals in 2017.
    • In the past decade, due to the nature of demand for high-skilled workers, visa holders have clustered in STEM fields and come mostly from countries in Asia such as China and India.
    • Technical jobs, such as Computer Systems Analyst and Software Developer, make up the majority of the top 10 jobs among foreign workers.
    • Employers located in metro areas strive to find foreign workers who can fill the technical positions in their organizations.
    • States like California, New York, Washington, New Jersey, Massachusetts, Illinois, and Texas are the prime locations for foreign workers and provide many job opportunities.
    • The top companies that submit the most H1B visa applications, such as Infosys, Tata, and IBM India, are companies based in India associated with software and IT services.
    • The Data Scientist position has experienced exponential growth in H1B visa applications, and jobs are clustered in the West region, which has the highest number.

    Visualization utilizing programs

    HTML, JavaScript, CSS, D3.js, Google API, Python, R, and Tableau
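The highlight-table idea described above (mean wage for Data Scientist and Data Analyst petitions with case status certified) can be sketched as follows. This is only an illustration, not the project's own code: the column names `CASE_STATUS`, `JOB_TITLE` and `PREVAILING_WAGE`, the helper `mean_certified_wage`, and every row in `SAMPLE_CSV` are assumptions made up for the example.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows: column names and values are invented for this
# sketch and are not taken from the real Kaggle or OFLC files.
SAMPLE_CSV = """CASE_STATUS,JOB_TITLE,PREVAILING_WAGE,YEAR
CERTIFIED,DATA SCIENTIST,100000,2016
CERTIFIED,DATA ANALYST,60000,2016
DENIED,DATA SCIENTIST,90000,2016
CERTIFIED,DATA SCIENTIST,120000,2015
CERTIFIED,DATA ANALYST,70000,2015
"""


def mean_certified_wage(rows, titles):
    """Return {job title: mean prevailing wage} over certified rows only."""
    wages = defaultdict(list)
    for row in rows:
        title = row["JOB_TITLE"].strip().upper()
        status = row["CASE_STATUS"].strip().upper()
        # Keep only certified petitions for the job titles of interest.
        if status == "CERTIFIED" and title in titles:
            wages[title].append(float(row["PREVAILING_WAGE"]))
    return {title: mean(values) for title, values in wages.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
result = mean_certified_wage(rows, {"DATA SCIENTIST", "DATA ANALYST"})
print(result)
```

With a real export, the in-memory sample would be replaced by an opened file handle, and the column names would need to be checked against the actual header row.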

  2. Classification of web-based Digital Humanities projects leveraging...

    • zenodo.org
    csv, tsv
    Updated Nov 28, 2024
    Cite
    Tommaso Battisti (2024). Classification of web-based Digital Humanities projects leveraging information visualisation techniques [Dataset]. http://doi.org/10.5281/zenodo.14192758
    Available download formats: tsv, csv
    Dataset updated
    Nov 28, 2024
    Dataset provided by
    Zenodo
    Authors
    Tommaso Battisti
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains a list of 186 Digital Humanities projects leveraging information visualisation methods. Each project has been classified according to visualisation and interaction techniques, narrativity and narrative solutions, domain, methods for the representation of uncertainty and interpretation, and the employment of critical and custom approaches to visually represent humanities data.

    Classification schema: categories and columns

    The project_id column contains unique internal identifiers assigned to each project. Meanwhile, the last_access column records the most recent date (in DD/MM/YYYY format) on which each project was reviewed based on the web address specified in the url column.
    The remaining columns can be grouped into descriptive categories aimed at characterising projects according to different aspects:

    Narrativity. It reports the presence of narratives employing information visualisation techniques. Here, the term narrative encompasses both author-driven linear data stories and more user-directed experiences where the narrative sequence is composed of user exploration [1]. We define 2 columns to identify projects using visualisation techniques in narrative, or non-narrative sections. Both conditions can be true for projects employing visualisations in both contexts. Columns:

    • non_narrative (boolean)

    • narrative (boolean)
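As a rough illustration of the columns described so far, the sketch below reads a few rows, parses `last_access` from its DD/MM/YYYY format, and lists projects that use visualisations in both narrative and non-narrative sections. The sample rows, the project identifiers, and the way booleans are spelled in the file are assumptions; the real TSV may encode them differently.

```python
import csv
import io
from datetime import datetime

# Hypothetical rows: identifiers, URLs, dates and the boolean spelling are
# invented for this sketch, not copied from the published file.
SAMPLE_TSV = (
    "project_id\turl\tlast_access\tnon_narrative\tnarrative\n"
    "p001\thttps://example.org/a\t05/03/2024\tTrue\tFalse\n"
    "p002\thttps://example.org/b\t17/11/2024\tTrue\tTrue\n"
    "p003\thttps://example.org/c\t01/02/2024\tFalse\tTrue\n"
)


def parse_bool(text):
    # Assumption: booleans are written as "True"/"False" or "1"/"0".
    return text.strip().lower() in {"true", "1"}


def load_projects(handle):
    """Parse the identifier, review date and narrativity flags of each row."""
    projects = []
    for row in csv.DictReader(handle, delimiter="\t"):
        projects.append({
            "project_id": row["project_id"],
            # last_access is documented as DD/MM/YYYY.
            "last_access": datetime.strptime(row["last_access"], "%d/%m/%Y").date(),
            "non_narrative": parse_bool(row["non_narrative"]),
            "narrative": parse_bool(row["narrative"]),
        })
    return projects


projects = load_projects(io.StringIO(SAMPLE_TSV))
# Projects using visualisations in both narrative and non-narrative sections.
both = [p["project_id"] for p in projects if p["narrative"] and p["non_narrative"]]
print(both)
```

Parsing the date explicitly with `%d/%m/%Y` avoids the day/month swap that locale-dependent parsers can introduce.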

    Domain. The humanities domain to which the project is related. We rely on [2] and the chapters of the first part of [3] to abstract a set of general domains. Column:

    • domain (categorical):

      • History and archaeology

      • Art and art history

      • Language and literature

      • Music and musicology

      • Multimedia and performing arts

      • Philosophy and religion

      • Other: both extra-list domains and cases of collections without a unique or specific thematic focus.

    Visualisation of uncertainty and interpretation. Building upon the frameworks proposed by [4] and [5], a set of categories was identified, highlighting a distinction between precise and impressional communication of uncertainty. Precise methods explicitly represent quantifiable uncertainty such as missing, unknown, or uncertain data, precisely locating and categorising it using visual variables and positioning. Two sub-categories are: interactive distinction, when uncertain data is not visually distinguishable from the rest of the data but can be dynamically isolated or included/excluded categorically through interaction techniques (usually filters); and visual distinction, when uncertainty visually “emerges” from the representation by means of dedicated glyphs and spatial or visual cues and variables. On the other hand, impressional methods communicate the constructed and situated nature of data [6], exposing the interpretative layer of the visualisation and indicating more abstract and unquantifiable uncertainty using graphical aids or interpretative metrics. Two sub-categories are: ambiguation, when the use of graphical expedients, like permeable glyph boundaries or broken lines, visually conveys the ambiguity of a phenomenon; and interpretative metrics, when expressive, non-scientific, or non-punctual metrics are used to build a visualisation. Column:

    • uncertainty_interpretation (categorical):

      • Interactive distinction

      • Visual distinction

      • Ambiguation

      • Interpretative metrics

    Critical adaptation. We identify projects in which, for at least one visualisation, the following criteria are fulfilled: 1) avoiding uncritical repurposing of prepackaged, generic-use, or ready-made solutions; 2) being tailored and unique to reflect the peculiarities of the phenomena at hand; 3) avoiding extreme simplifications in order to embrace and depict complexity, promoting time-spending visualisation-based inquiry. Column:

    • critical_adaptation (boolean)

    Non-temporal visualisation techniques. We adopt and partially adapt the terminology and definitions from [7]. A column is defined for each type of visualisation and accounts for its presence within a project, also including stacked layouts and more complex variations. Columns and inclusion criteria:

    • plot (boolean): visual representations that map data points onto a two-dimensional coordinate system.

    • cluster_or_set (boolean): sets or cluster-based visualisations used to unveil possible inter-object similarities.

    • map (boolean): geographical maps used to show spatial insights. While we do not specify the variants of maps (e.g., pin maps, dot density maps, flow maps, etc.), we make an exception for maps where each data point is represented by another visualisation (e.g., a map where each data point is a pie chart) by accounting for the presence of both in their respective columns.

    • network (boolean): visual representations highlighting relational aspects through nodes connected by links or edges.

    • hierarchical_diagram (boolean): tree-like structures such as tree diagrams, radial trees, but also dendrograms. They differ from networks for their strictly hierarchical structure and absence of closed connection loops.

    • treemap (boolean): still hierarchical, but highlighting quantities expressed by means of area size. It also includes circle packing variants.

    • word_cloud (boolean): clouds of words, where each instance’s size is proportional to its frequency in a related context.

    • bars (boolean): includes bar charts, histograms, and variants. It coincides with “bar charts” in [7] but with a more generic term to refer to all bar-based visualisations.

    • line_chart (boolean): the display of information as sequential data points connected by straight-line segments.

    • area_chart (boolean): similar to a line chart but with a filled area below the segments. It also includes density plots.

    • pie_chart (boolean): circular graphs divided into slices which can also use multi-level solutions.

    • plot_3d (boolean): plots that use a third dimension to encode an additional variable.

    • proportional_area (boolean): representations used to compare values through area size. Typically, using circle- or square-like shapes.

    • other (boolean): it includes all other types of non-temporal visualisations that do not fall into the aforementioned categories.
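Because each non-temporal technique has its own boolean column, a frequency overview reduces to counting the true values per column. The sketch below shows one way to do that; the column list follows the names given above, while the sample projects and the helper `count_techniques` are hypothetical.

```python
from collections import Counter

# Column names as listed in the schema above.
TECHNIQUE_COLUMNS = [
    "plot", "cluster_or_set", "map", "network", "hierarchical_diagram",
    "treemap", "word_cloud", "bars", "line_chart", "area_chart",
    "pie_chart", "plot_3d", "proportional_area", "other",
]


def count_techniques(projects):
    """Count, per technique column, how many projects have it set to True."""
    counts = Counter()
    for project in projects:
        for column in TECHNIQUE_COLUMNS:
            # A missing key is treated as False in this sketch.
            if project.get(column, False):
                counts[column] += 1
    return counts


# Hypothetical, already-parsed rows (booleans instead of raw strings).
sample = [
    {"project_id": "p001", "map": True, "network": True},
    {"project_id": "p002", "map": True, "bars": True},
    {"project_id": "p003", "timeline": True},  # temporal column, not counted
]
counts = count_techniques(sample)
print(counts.most_common())
```

Since a project can set several columns at once, the counts are not expected to sum to the number of projects.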

    Temporal visualisations and encodings. In addition to non-temporal visualisations, a group of techniques to encode temporality is considered in order to enable comparisons with [7]. Columns:

    • timeline (boolean): the display of a list of data points or spans in chronological order. They include timelines working either with a scale or simply displaying events in sequence. As in [7], we also include structured solutions resembling Gantt chart layouts.

    • temporal_dimension (boolean): to report when time is mapped to any dimension of a visualisation, with the exclusion of timelines. We use the term “dimension” and not “axis” as in [7] as more appropriate for radial layouts or more complex representational choices.

    • animation (boolean): temporality is perceived through an animation changing the visualisation according to time flow.

    • visual_variable (boolean): another visual encoding strategy is used to represent any temporality-related variable (e.g., colour).

    Interaction techniques. A set of categories to assess affordable interaction techniques based on the concept of user intent [8] and user-allowed data actions [9]. The following categories roughly match the “processing”, “mapping”, and “presentation” actions from [9] and the manipulative subset of methods of the “how” an interaction is performed in the conception of [10]. Only interactions that affect the visual representation or the aspect of data points, symbols, and glyphs are taken into consideration. Columns:

    • basic_selection (boolean): the demarcation of an element either for the duration of the interaction or more permanently until the occurrence of another selection.

    • advanced_selection (boolean): the demarcation involves both the selected element and connected elements within the visualisation or leads to brush and link effects across views. Basic selection is tacitly implied.

    • navigation (boolean): interactions that allow moving, zooming, panning, rotating, and scrolling the view but only when applied to the visualisation and not to the web page. It also includes “drill” interactions (to navigate through different levels or portions of data detail, often generating a new view that replaces or accompanies the original) and “expand” interactions generating new perspectives on data by expanding and collapsing nodes.

    • arrangement (boolean): methods to organise visualisation elements (symbols, glyphs, etc.) or

  3. Summary for Policymakers of the Working Group I Contribution to the IPCC...

    • catalogue.ceda.ac.uk
    • data-search.nerc.ac.uk
    Updated Aug 9, 2021
    Cite
    Joeri Rogelj; Chris Smith; Gian-Kasper Plattner; Malte Meinshausen; Sophie Szopa; Sebastian Milinski; Jochem Marotzke (2021). Summary for Policymakers of the Working Group I Contribution to the IPCC Sixth Assessment Report - data for Figure SPM.4 (v20210809) [Dataset]. https://catalogue.ceda.ac.uk/uuid/bd65331b1d344ccca44852e495d3a049
    Dataset updated
    Aug 9, 2021
    Dataset provided by
    Centre for Environmental Data Analysis: http://www.ceda.ac.uk/
    Authors
    Joeri Rogelj; Chris Smith; Gian-Kasper Plattner; Malte Meinshausen; Sophie Szopa; Sebastian Milinski; Jochem Marotzke
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 2015 - Dec 31, 2100
    Area covered
    Earth
    Description

    Data for Figure SPM.4 from the Summary for Policymakers (SPM) of the Working Group I (WGI) Contribution to the Intergovernmental Panel on Climate Change (IPCC) Sixth Assessment Report (AR6).

    Figure SPM.4 panel a shows global emissions projections for CO2 and a set of key non-CO2 climate drivers, for the core set of five IPCC AR6 scenarios. Figure SPM.4 panel b shows attributed warming in 2081-2100 relative to 1850-1900 for total anthropogenic, CO2, other greenhouse gases, and other anthropogenic forcings for five Shared Socio-economic Pathway (SSP) scenarios.

    How to cite this dataset

    When citing this dataset, please include both the data citation below (under 'Citable as') and the following citation for the report component from which the figure originates:

    IPCC, 2021: Summary for Policymakers. In: Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change [Masson-Delmotte, V., P. Zhai, A. Pirani, S.L. Connors, C. Péan, S. Berger, N. Caud, Y. Chen, L. Goldfarb, M.I. Gomis, M. Huang, K. Leitzell, E. Lonnoy, J.B.R. Matthews, T.K. Maycock, T. Waterfield, O. Yelekçi, R. Yu, and B. Zhou (eds.)]. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, pp. 3−32, doi:10.1017/9781009157896.001.

    Figure subpanels

    The figure has two panels, with data provided for all panels in subdirectories named panel_a and panel_b.

    List of data provided

    This dataset contains:

    • Projected emissions from 2015 to 2100 for the five scenarios of the AR6 WGI core scenario set (SSP1-1.9, SSP1-2.6, SSP2-4.5, SSP3-7.0, SSP5-8.5)
    • Projected warming for all anthropogenic forcers, CO2 only, non-CO2 greenhouse gases (GHGs) only, and other anthropogenic components for 2081-2100 relative to 1850-1900, for SSP1-1.9, SSP1-2.6, SSP2-4.5, SSP3-7.0 and SSP5-8.5.

    The five illustrative SSP (Shared Socio-economic Pathway) scenarios are described in Box SPM.1 of the Summary for Policymakers and Section 1.6.1.1 of Chapter 1.

    Data provided in relation to figure

    Panel a:

    The first column includes the years, while the next columns include the data per scenario and per climate forcer for the line graphs.

    • Data file: Carbon_dioxide_Gt_CO2_yr.csv (relates to the Carbon dioxide emissions panel)
    • Data file: Methane_Mt_CO2_yr.csv (relates to the Methane emissions panel)
    • Data file: Nitrous_oxide_Mt N2O_yr.csv (relates to the Nitrous oxide emissions panel)
    • Data file: Sulfur_dioxide_Mt SO2_yr.csv (relates to the Sulfur dioxide emissions panel)
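The panel a layout described above (years in the first column, one data column per series after it) could be read along these lines. This is a sketch under assumptions: the header labels and all numbers in `SAMPLE_CSV` are placeholders, not values from the published files, and `read_series` is a hypothetical helper.

```python
import csv
import io

# Placeholder content: header labels and numbers are invented to show the
# layout only and do not reproduce the published emissions data.
SAMPLE_CSV = """year,scenario_A,scenario_B
2015,1.0,1.0
2050,0.5,2.0
2100,0.0,3.0
"""


def read_series(handle):
    """Return {column name: {year: value}} for a years-first CSV layout."""
    reader = csv.reader(handle)
    header = next(reader)
    names = header[1:]  # every column after the first holds one series
    series = {name: {} for name in names}
    for record in reader:
        year = int(record[0])
        for name, value in zip(names, record[1:]):
            series[name][year] = float(value)
    return series


series = read_series(io.StringIO(SAMPLE_CSV))
print(sorted(series["scenario_B"].items()))
```

Keying each series by year keeps the lookup independent of row order, which helps if the files are not sorted chronologically.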

    Panel b:

    • Data file: ts_warming_ranges_1850-1900_base_panel_b.csv. Rows 2 to 5 relate to the first bar chart (cyan). Rows 6 to 9 relate to the second bar chart (blue). Rows 10 to 13 relate to the third bar chart (orange). Rows 14 to 17 relate to the fourth bar chart (red). Rows 18 to 21 relate to the fifth bar chart (brown).
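The row-to-chart mapping for panel b is regular (four rows per bar chart, starting at row 2), so it can be expressed as integer division. The helper below is a hypothetical sketch of that arithmetic; it maps rows to chart position and colour only, since the record does not state which scenario each colour represents.

```python
# Colours in the order given in the record: rows 2-5, 6-9, 10-13, 14-17, 18-21.
BAR_CHART_COLOURS = ["cyan", "blue", "orange", "red", "brown"]
FIRST_ROW = 2        # row 1 is assumed to be the header
ROWS_PER_CHART = 4


def bar_chart_for_row(row_number):
    """Map a 1-based file row (2..21) to (chart position, colour)."""
    offset = row_number - FIRST_ROW
    if not 0 <= offset < ROWS_PER_CHART * len(BAR_CHART_COLOURS):
        raise ValueError(f"row {row_number} is outside the documented range 2-21")
    index = offset // ROWS_PER_CHART
    return index + 1, BAR_CHART_COLOURS[index]


print(bar_chart_for_row(2))   # first row of the first chart
print(bar_chart_for_row(21))  # last row of the fifth chart
```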

    Sources of additional information

    The following weblinks are provided in the Related Documents section of this catalogue record:

    • Link to the report webpage, which includes the report component containing the figure (Summary for Policymakers) and the Supplementary Material for Chapter 1, which contains details on the input data used in Table 1.SM.1 (Cross-Chapter Box 1.4, Figure 2).
    • Link to related publication for input data used in panel a.


