5 datasets found

Supplementary Figure SCA analysis using a manually curated...
figshare.com
zip
Updated Aug 26, 2020
Cite
Raffaele Calogero (2020). Supplementary Figure SCA analysis using a manually curated cancer-immune-signature (SCA tutorial) [Dataset]. http://doi.org/10.6084/m9.figshare.12867029.v1
Explore at:
zip (available download formats)
Unique identifier
https://doi.org/10.6084/m9.figshare.12867029.v1
Dataset updated
Aug 26, 2020
Dataset provided by
figshare
Authors
Raffaele Calogero
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset used to assemble the figure "SCA analysis using a manually curated cancer-immune-signature" in the SCAtutorial vignette (https://kendomaniac.github.io/SCAtutorial/articles/SCAvignette.html)
Pokemon data mining 2020
kaggle.com
Updated Jul 31, 2020
Cite
AJ Pass (2020). Pokemon data mining 2020 [Dataset]. https://www.kaggle.com/ajpass/pokemon-data-mining-2020/metadata
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jul 31, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
AJ Pass
Description
Context

This dataset was obtained using a web scraper built in the notebook below, written to learn data mining and web scraping:

https://www.kaggle.com/ajpass/data-mining-web-scrapper-vol-1-pokedex

Content

This dataset contains the different generations of Pokemon with all their stats.

Acknowledgements

This dataset came from what I learned following a tutorial a year ago. Because I couldn't find that tutorial again, I made a version from what I remembered.
Augmenting geovisual analytics of social media data with heterogeneous...
plos.figshare.com
figshare.com
docx
Updated Jun 2, 2023
Cite
Alexander Savelyev; Alan M. MacEachren (2023). Augmenting geovisual analytics of social media data with heterogeneous information network mining—Cognitive plausibility assessment [Dataset]. http://doi.org/10.1371/journal.pone.0206906
Explore at:
docx (available download formats)
Unique identifier
https://doi.org/10.1371/journal.pone.0206906
Dataset updated
Jun 2, 2023
Dataset provided by
PLOS ONE
Authors
Alexander Savelyev; Alan M. MacEachren
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This paper investigates the feasibility, from a user perspective, of integrating a heterogeneous information network mining (HINM) technique into SensePlace3 (SP3), a web-based geovisual analytics environment. The core contribution of this paper is a user study that determines whether an analyst with minimal background can comprehend the network data modeling metaphors employed by the resulting system, whether they can employ said metaphors to explore spatial data, and whether they can interpret the results of such spatial analysis correctly. This study confirms that all of the above is, indeed, possible, and provides empirical evidence about the importance of a hands-on tutorial and a graphical approach to explaining data modeling metaphors in the successful adoption of advanced data mining techniques. Analysis of outcomes of data exploration by the study participants also demonstrates the kinds of insights that a visual interface to HINM can enable. A second contribution is a realistic case study that demonstrates that our HINM approach (made accessible through a visual interface that provides immediate visual feedback for user queries) produces a clear and positive difference in the outcome of spatial analysis. Although this study does not aim to validate HINM as a data modeling approach (there is considerable evidence for this in existing literature), the results of the case study suggest that HINM holds promise in the (geo)visual analytics domain as well, particularly when integrated into geovisual analytics applications. A third contribution is a user study protocol that is based on and improves upon the current methodological state of the art. This protocol includes a hands-on tutorial and a set of realistic data analysis tasks. Detailed evaluation protocols are rare in geovisual analytics (and in visual analytics more broadly), with most studies reviewed in this paper failing to provide sufficient details for study replication or comparison work.
Visual Data Mining of Biological Networks: One Size Does Not Fit All
plos.figshare.com
xml
Updated May 31, 2023
Cite
Chiara Pastrello; David Otasek; Kristen Fortney; Giuseppe Agapito; Mario Cannataro; Elize Shirdel; Igor Jurisica (2023). Visual Data Mining of Biological Networks: One Size Does Not Fit All [Dataset]. http://doi.org/10.1371/journal.pcbi.1002833
Explore at:
xml (available download formats)
Unique identifier
https://doi.org/10.1371/journal.pcbi.1002833
Dataset updated
May 31, 2023
Dataset provided by
PLOS Computational Biology
Authors
Chiara Pastrello; David Otasek; Kristen Fortney; Giuseppe Agapito; Mario Cannataro; Elize Shirdel; Igor Jurisica
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
High-throughput technologies produce massive amounts of data. However, individual methods yield data specific to the technique used and biological setup. The integration of such diverse data is necessary for the qualitative analysis of information relevant to hypotheses or discoveries. It is often useful to integrate these datasets using pathways and protein interaction networks to get a broader view of the experiment. The resulting network needs to be able to focus on either the large-scale picture or on the more detailed small-scale subsets, depending on the research question and goals. In this tutorial, we illustrate a workflow useful to integrate, analyze, and visualize data from different sources, and highlight important features of tools to support such analyses.
Data from: Dataset for Vector space model and the usage patterns of...
figshare.com
bin
Updated May 30, 2023
Cite
Gede Primahadi Wijaya Rajeg; Karlina Denistia; Simon Musgrave (2023). Dataset for Vector space model and the usage patterns of Indonesian denominal verbs [Dataset]. http://doi.org/10.6084/m9.figshare.8187155.v1
Explore at:
bin (available download formats)
Unique identifier
https://doi.org/10.6084/m9.figshare.8187155.v1
Dataset updated
May 30, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Gede Primahadi Wijaya Rajeg; Karlina Denistia; Simon Musgrave
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Preface

This is the data repository for the paper accepted for publication in NUSA's special issue on Linguistic studies using large annotated corpora (co-edited by Hiroki Nomoto and David Moeljadi).

How to cite the dataset

If you use, adapt, and/or modify any of the dataset in this repository for your research or teaching purposes (except for the malindo_dbase, see below), please cite as:

Rajeg, Gede Primahadi Wijaya; Denistia, Karlina; Musgrave, Simon (2019): Dataset for Vector space model and the usage patterns of Indonesian denominal verbs. figshare. Fileset. https://doi.org/10.6084/m9.figshare.8187155

Alternatively, click on the dark pink Cite button to browse different citation style (default is DataCite).

The malindo_dbase data in this repository is from Nomoto et al. (2018) (cf the GitHub repository). So please also cite their work if you use it for your research:

Nomoto, Hiroki, Hannah Choi, David Moeljadi and Francis Bond. 2018. MALINDO Morph: Morphological dictionary and analyser for Malay/Indonesian. Kiyoaki Shirai (ed.) Proceedings of the LREC 2018 Workshop "The 13th Workshop on Asian Language Resources", 36-43.

Tutorial on how to use the data together with the R Markdown Notebook for the analyses is available on GitHub and figshare:

Rajeg, Gede Primahadi Wijaya; Denistia, Karlina; Musgrave, Simon (2019): R Markdown Notebook for Vector space model and the usage patterns of Indonesian denominal verbs. figshare. Software. doi: https://doi.org/10.6084/m9.figshare.9970205

Dataset description

1. Leipzig_w2v_vector_full.bin is the vector space model used in the paper. We built it using wordVectors package (Schmidt & Li 2017) via the MonARCH High Performance Computing Cluster (We thank Philip Chan for his help with access to MonARCH).
2. Files beginning with ngramexmpl_... are data for the n-grams (i.e. words sequence) of verbs discussed in the paper. The files are in tab-separated format.
3. Files beginning with sentence_... are full sentences for the verbs discussed in the paper (in the plain text format and R dataset format [.rds]). Information of the corpus file and sentence number in which the verb is found are included.
4. me_parsed_nountaggedbase (in three different file-formats) contains database of the me- words with noun-tagged root that MorphInd identified to occur in three morphological schemas we focus on (me-, me-/-kan, and me-/-i). The database has columns for the verbs' token frequency in the corpus, root forms, MorphInd parsing output, among others.
5. wordcount_leipzig_allcorpus (in three different file-formats) contains information on the size of each corpus file used in the paper and from which the vector space model is built.
6. wordlist_leipzig_ME_DI_TER_percorpus.tsv is a tab-separated frequency list of words prefixed with me-, di-, and ter- in all thirteen corpus files used. The wordlist is built by first tokenising each corpus file, lowercasing the tokens, and then extracting the words with the corresponding three prefixes using the following regular expressions:
   - For me-: ^(?i)(me)([a-z-]{3,})$
   - For di-: ^(?i)(di)([a-z-]{3,})$
   - For ter-: ^(?i)(ter)([a-z-]{3,})$
7. malindo_dbase is the MALINDO Morphological Dictionary (see above).

References

Schmidt, Ben & Jian Li. 2017. wordVectors: Tools for creating and analyzing vector-space models of texts. R package. http://github.com/bmschmidt/wordVectors.
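
The wordlist construction described in item 6 (tokenise, lowercase, then extract words by prefix) can be sketched roughly as follows. This is an illustrative Python sketch, not the authors' code: their analyses are in an R Markdown Notebook. The `tokenise` and `prefixed_word_counts` helpers and the sample sentence are hypothetical, and the tokeniser is a simple whitespace split because the description does not say which tokeniser was used. The three patterns are taken from the description, except that the inline `(?i)` flag is replaced by `re.IGNORECASE`, since recent Python versions reject a global inline flag that is not at the very start of a pattern.

```python
import re
from collections import Counter

# Patterns adapted from the dataset description. The inline (?i) flag is
# replaced by re.IGNORECASE (an adaptation for Python, not the original form).
PREFIX_PATTERNS = {
    "me": re.compile(r"^(me)([a-z-]{3,})$", re.IGNORECASE),
    "di": re.compile(r"^(di)([a-z-]{3,})$", re.IGNORECASE),
    "ter": re.compile(r"^(ter)([a-z-]{3,})$", re.IGNORECASE),
}


def tokenise(text):
    """Hypothetical tokeniser: split on whitespace, strip surrounding punctuation."""
    tokens = (tok.strip(".,;:!?\"'()[]") for tok in text.split())
    return [tok for tok in tokens if tok]


def prefixed_word_counts(text):
    """Return {prefix: Counter} of lowercased tokens matching each prefix pattern."""
    counts = {prefix: Counter() for prefix in PREFIX_PATTERNS}
    for token in tokenise(text):
        token = token.lower()
        for prefix, pattern in PREFIX_PATTERNS.items():
            if pattern.match(token):
                counts[prefix][token] += 1
    return counts


# Made-up sample text, for illustration only (not taken from the corpus files).
sample = "Dia membaca buku. Buku itu dibaca dan terbaca; Membaca itu baik."
result = prefixed_word_counts(sample)
for prefix, counter in result.items():
    print(prefix, dict(counter))
```

Because each pattern requires at least three characters after the prefix, a short word such as "dia" is not counted as a di- form. Since the tokens are lowercased before matching, the case-insensitive flag is redundant here and is kept only to mirror the patterns in the description.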