6 datasets found

MEDICINA-corpus_reducido+MIR+wiki
kaggle.com
Updated May 8, 2023
Cite
Manuel González Martínez (2023). MEDICINA-corpus_reducido+MIR+wiki [Dataset]. https://www.kaggle.com/datasets/manuelgonzlezmartnez/medicina-corpus-reducido-mir-wiki
Dataset updated
May 8, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Manuel González Martínez
Description
This dataset contains the tokenized version of a corpus made up of 60% of the OSCAR Spanish corpus, wiki data from multiple countries, and medicine books. Because the full corpus is so large, I had to cut the OSCAR corpus to make it a little smaller. For the same reason, I uploaded the tokenized version: if you want or need to work with this dataset inside Kaggle, you do not have enough space to tokenize it there.

I have also uploaded the code used to tokenize the dataset.

If you want me to upload the entire dataset divided into 4 parts, ask for it. :)
Global IT Information Technology Market Report 2025 Edition, Market Size,...
cognitivemarketresearch.com
pdf,excel,csv,ppt
Cite
Cognitive Market Research, Global IT Information Technology Market Report 2025 Edition, Market Size, Share, CAGR, Forecast, Revenue [Dataset]. https://www.cognitivemarketresearch.com/it-information-technology-market-report
Available download formats: pdf, excel, csv, ppt
Dataset authored and provided by
Cognitive Market Research
License
https://www.cognitivemarketresearch.com/privacy-policy
Time period covered
2021 - 2033
Area covered
Global
Description
According to Cognitive Market Research, Information Technology Global Market Size was USD XX Million in 2024 and is set to achieve a market size of USD XX Million by the end of 2033 growing at a CAGR of XX% from 2025 to 2033.
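The CAGR quoted above follows the standard compound-growth formula. As a minimal sketch, the relationship between a start value, an end value, and the growth rate can be written as below. The figures are hypothetical (the report masks its own numbers as "XX"), the function names are illustrative, and the choice of nine compounding periods assumes 2024 is the base year and 2033 the end year.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


def project(start_value: float, rate: float, years: float) -> float:
    """Value reached after compounding `rate` once per year for `years` years."""
    return start_value * (1.0 + rate) ** years


# Hypothetical market sizes in USD million; NOT taken from the report.
base_2024 = 100.0
target_2033 = 200.0
periods = 2033 - 2024  # assumption: nine annual compounding steps

rate = cagr(base_2024, target_2033, periods)
print(f"Implied CAGR: {rate:.2%}")
print(f"Check: {project(base_2024, rate, periods):.1f}")
```

Compounding the computed rate forward again recovers the end value, which is a quick sanity check on any published market-size and CAGR pair.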

North America held the largest share of xx% in the year 2024. Europe held a share of xx% in the year 2024. Asia-Pacific held a significant share of xx% in the year 2024. South America held a significant share of xx% in the year 2024. Middle East and Africa held a significant share of xx% in the year 2024.

Market Dynamics of IT Information Technology Market

Key Drivers of IT Information Technology Market

The Growing Adoption of Cloud Computing, Artificial Intelligence, and Big Data

The extensive incorporation of cutting-edge digital technologies—cloud computing, AI, and big data—serves as a key catalyst for the growth of the IT market. Cloud computing provides businesses with scalable and adaptable infrastructure, AI enhances operational efficiency through automation and predictive analytics, and big data supports informed decision-making. For example, Atera’s collaboration with Azure OpenAI facilitates predictive issue resolution and significantly enhances IT productivity. These technologies are transforming workflows across various industries and driving innovation, ensuring that the IT sector remains at the forefront of global digital transformation.

(Source: https://www.microsoft.com/en/customers/story/1662731177894407321-atera-professional-services-azure-en-israel)

The Transformative Influence of IoT is Enhancing the Global IT Sector

The rapid proliferation of Internet of Things (IoT) devices—projected to exceed 16.6 billion by the close of 2023—has intensified the demand for IT infrastructure, services, and analytics. IoT fosters real-time data gathering, automation, and predictive maintenance in sectors such as healthcare, manufacturing, and smart cities. The immense data produced by interconnected devices is propelling advancements in AI, cloud computing, and edge computing. With increasing investments in 5G and digital infrastructure, IoT continues to serve as a vital enabler of IT market growth on a global scale.

(Source: https://iot-analytics.com/product/state-of-iot-summer-2024/)

Key Restraints in IT Information Technology Market

Growing Concerns Regarding Data Privacy are Impeding IT Market Expansion

High-profile cyber incidents, such as the 2021 Microsoft Exchange Server breach, have triggered considerable anxiety regarding data security. Consumer apprehensions about surveillance, unauthorized access, and the corporate misuse of personal data are on the rise. According to Deloitte, almost 60% of consumers express concerns about security breaches, with trust in corporate data management notably diminished. This situation has prompted demands for more stringent privacy regulations and may hinder digital adoption due to heightened compliance requirements and public skepticism.

(Sources: https://www2.deloitte.com/us/en/insights/industry/telecommunications/connectivity-mobile-trends-survey/2023/data-privacy-and-security.html and https://en.wikipedia.org/wiki/WannaCry_ransomware_attack)

Cybersecurity Threats and the Escalation of Attack Complexity

The emergence of intricate cyber threats, such as ransomware (e.g., WannaCry), poses a persistent challenge for the IT industry. Cybercriminals take advantage of weaknesses in essential systems, leading to financial losses, data breaches, and damage to reputation. Tackling cybersecurity necessitates ongoing investment in threat detection, endpoint security, and adherence to regulations. These evolving threats not only increase operational expenses but also discourage smaller enterprises from adopting advanced IT solutions due to the fear of vulnerability.

Key Trends of IT Information Technology Market

Expansion of Edge Computing to Facilitate Real-Time Applications

As IoT and smart devices become more prevalent, edge computing is gaining traction by processing data nearer to its source. This approach minimizes latency and enhances response times, making it particularly suitable for real-time applications such as autonomous vehicles, smart manufacturing, and augmented reality. The shift towards edge infrastructure is transforming IT architectures to more effectively balance cloud and on-premise computing requirements.

Increase i...
Wikia census / Fandom census
kaggle.com
zip
Updated Oct 19, 2018
Cite
Abel Serrano Juste (2018). Wikia census / Fandom census [Dataset]. https://www.kaggle.com/abeserra/wikia-census
Available download formats: zip (87833068 bytes)
Dataset updated
Oct 19, 2018
Authors
Abel Serrano Juste
License
Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
Description
Context

A census of all the wikis hosted on Wikia (now renamed Fandom). The dataset consists of data on more than 300 thousand wikis, such as: language, topic, number of users, admins, articles, edits, pages, number of users with a certain number of contributions, number of bots, etc.

A study of this data was presented at the OpenSym 2018 conference. You can find the Jupyter notebook code for that study under the "Kernels" section.

Content

There are several data files:
- wikia_stats.csv: general data about each wiki.
- wikia_stats_users.csv: general data about each wiki + number of human registered users, categorized according to the number of edits in the last 30 days (Users_N).
- wikia_stats_users_birthdate.csv: all the data above plus the estimated date of birth.

If you are just looking for the whole dataset corresponding to the Wikia census, use the wikia_stats_users_birthdate.csv file.

The other two .txt files contain pairs of (name, url): the raw index crawled from the Wikia Sitemap, and the corresponding curated index with only the working wikis.

The data for this second version were collected in October 2018. The first version dates from February 2018.

The data were collected using the scripts located here: https://github.com/Grasia/wiki-scripts

The license of the data is not clearly stated by Wikia: the data is publicly available on their website, but they have not established anything about it in their license policy.

Acknowledgements

All the data is possible thanks to FANDOM, the company supporting Wikia, and thanks to all the contributors to the wikis.

Inspiration

We want to find the patterns that characterize a healthy and sustainable online community.

Wikia is a huge ecosystem of these communities, where small, medium, and big as well as young and old communities coexist, so it is a perfect scenario to study online collaboration.

License

This data is released under the Creative Commons Attribution-Share Alike License 3.0 (Unported) (CC-BY-SA). Please attribute FANDOM (The company behind Wikia) and me (Abel Serrano Juste) when using this data.
Multiple Single Cell RNA Expressions ARCHS4
kaggle.com
zip
Updated Jul 25, 2021
Cite
Alexander Chervov (2021). Multiple Single Cell RNA Expressions ARCHS4 [Dataset]. https://www.kaggle.com/datasets/alexandervc/multiple-single-cell-rna-expressions-archs4/data
Available download formats: zip (23319014182 bytes)
Dataset updated
Jul 25, 2021
Authors
Alexander Chervov
Description
Remark: for cell cycle analysis - see paper https://arxiv.org/abs/2208.05229 "Computational challenges of cell cycle analysis using single cell transcriptomics" Alexander Chervov, Andrei Zinovyev

Context

The dataset was downloaded from https://amp.pharm.mssm.edu/archs4/download.html. The methods are described in a Nature Communications paper: https://www.nature.com/articles/s41467-018-03751-6

ARCHS4 provides user-friendly access to a large collection of gene expression data from the GEO database (https://www.ncbi.nlm.nih.gov/geo/). While most data in the GEO database is stored in raw formats, ARCHS4 provides prepared count-matrix expression data. While GEO stores data separately for each research paper, ARCHS4 collects all the information in one single matrix. Consult the main site for further information.

The main data files are in H5 (HDF5, Hierarchical Data Format) file format: https://en.wikipedia.org/wiki/Hierarchical_Data_Format. They contain expression data, as well as annotation data and further meta-information. There are several other auxiliary files, such as a TSNE 3D projection (in CSV format) and correlation matrices for genes for human and mouse in feather format.

Content

The main file for human data, human_matrix.h5, contains the data matrix (238522 samples by 35238 genes) as well as various meta-information: gene names, sample information (tissue, etc.), and references to the GEO database IDs where all the details can be found.

There are also similar data for mouse, CSV files with TSNE images, and correlation matrices for genes.
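To give a sense of scale for the matrix dimensions quoted above, here is a rough back-of-the-envelope sketch of how large the human matrix would be if held densely in memory. The 4-byte value size is an assumption (the description does not state the stored data type), and `rows_per_chunk` is a hypothetical helper, not part of any ARCHS4 tooling.

```python
SAMPLES = 238_522    # from the dataset description
GENES = 35_238       # from the dataset description
BYTES_PER_VALUE = 4  # assumption: 32-bit counts; the real dtype may differ

entries = SAMPLES * GENES
dense_gib = entries * BYTES_PER_VALUE / 2**30


def rows_per_chunk(memory_budget_bytes: int, n_columns: int,
                   bytes_per_value: int = BYTES_PER_VALUE) -> int:
    """How many full rows of width n_columns fit in the given memory budget."""
    return max(1, memory_budget_bytes // (n_columns * bytes_per_value))


print(f"{entries:,} matrix entries")
print(f"about {dense_gib:.1f} GiB if loaded densely")
print(f"{rows_per_chunk(2**30, GENES):,} rows of {GENES:,} values fit in 1 GiB")
```

Under that assumption the dense matrix is larger than the memory of a typical notebook session, which suggests reading the H5 file in slices rather than all at once.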

Acknowledgements

The ARCHS4 project is by Alexander Lachmann (alexander.lachmann@mssm.edu), update: 2020-02-06.
Bitcoin Blockchain Historical Data
kaggle.com
zip
Updated Feb 12, 2019
Cite
Google BigQuery (2019). Bitcoin Blockchain Historical Data [Dataset]. https://www.kaggle.com/bigquery/bitcoin-blockchain
Available download formats: zip (0 bytes)
Dataset updated
Feb 12, 2019
Dataset provided by
BigQuery (https://cloud.google.com/bigquery)
Authors
Google BigQuery
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

Blockchain technology, first implemented by Satoshi Nakamoto in 2009 as a core component of Bitcoin, is a distributed, public ledger recording transactions. Its usage allows secure peer-to-peer communication by linking blocks containing hash pointers to a previous block, a timestamp, and transaction data. Bitcoin is a decentralized digital currency (cryptocurrency) which leverages the Blockchain to store transactions in a distributed manner in order to mitigate against flaws in the financial industry.
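The linking described above, where each block carries a hash pointer to the previous block, a timestamp, and transaction data, can be illustrated with a toy hash chain. This is a simplified sketch only: the block layout, field names, and JSON serialization are assumptions made for illustration and do not reproduce Bitcoin's actual block format or consensus rules.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the block (toy format)."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain: list, transactions: list, timestamp: int) -> None:
    """Add a block whose prev_hash points at the current last block."""
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({
        "prev_hash": prev_hash,
        "timestamp": timestamp,
        "transactions": transactions,
    })


def is_valid(chain: list) -> bool:
    """Every block must reference the hash of the block before it."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = []
append_block(chain, ["alice->bob:1"], timestamp=1)
append_block(chain, ["bob->carol:1"], timestamp=2)
append_block(chain, ["carol->dave:1"], timestamp=3)
print(is_valid(chain))
```

Because each hash pointer covers the full contents of the previous block, editing any earlier block changes its hash and breaks the link held by its successor, which is what makes tampering detectable.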

Nearly ten years after its inception, Bitcoin and other cryptocurrencies experienced an explosion in popular awareness. The value of Bitcoin, on the other hand, has experienced more volatility. Meanwhile, as use cases of Bitcoin and Blockchain grow, mature, and expand, hype and controversy have swirled.

Content

In this dataset, you will have access to information about blockchain blocks and transactions. All historical data are in the bigquery-public-data:crypto_bitcoin dataset, which is updated every 10 minutes. The data can be joined with historical prices in kernels. See available similar datasets here: https://www.kaggle.com/datasets?search=bitcoin.

Querying BigQuery tables

You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.crypto_bitcoin.[TABLENAME]. Fork this kernel to get started.
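As a minimal sketch of the table-naming pattern above, the helpers below build a fully qualified table reference and a small preview query. The helper names are hypothetical, and `transactions` is used only because the description mentions a table of that name. `run_preview` assumes the `google-cloud-bigquery` package is installed and credentials are configured, so it is defined but not called here.

```python
PROJECT = "bigquery-public-data"
DATASET = "crypto_bitcoin"


def table_ref(table_name: str) -> str:
    """Fully qualified name: bigquery-public-data.crypto_bitcoin.[TABLENAME]."""
    if not table_name.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table_name!r}")
    return f"{PROJECT}.{DATASET}.{table_name}"


def preview_query(table_name: str, limit: int = 10) -> str:
    """SQL text that returns a handful of rows from one table."""
    return f"SELECT * FROM `{table_ref(table_name)}` LIMIT {int(limit)}"


def run_preview(table_name: str, limit: int = 10) -> list:
    """Run the preview query; requires the client library and credentials."""
    from google.cloud import bigquery  # imported lazily so the sketch loads without it

    client = bigquery.Client()
    return list(client.query(preview_query(table_name, limit)).result())


print(preview_query("transactions", limit=5))
```

Validating the table name before interpolating it keeps the query text predictable. Selecting only the columns you need, rather than `*`, is usually the cheaper choice on large public tables.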

Method & Acknowledgements

Allen Day, Google Cloud Developer Advocate, and Colin Bookman, Google Cloud Customer Engineer, retrieve data from the Bitcoin network using a custom client available on GitHub that they built with the bitcoinj Java library. Historical data from the origin block to 2018-01-31 were loaded in bulk to two BigQuery tables, blocks_raw and transactions. These tables contain fresh data, as they are now appended when new blocks are broadcast to the Bitcoin network. For additional information visit the Google Cloud Big Data and Machine Learning Blog post "Bitcoin in BigQuery: Blockchain analytics on public data".

Photo by Andre Francois on Unsplash.

Inspiration

How many bitcoins are sent each day?

How many addresses receive bitcoin each day?

Compare transaction volume to historical prices by joining with other available data sources
Google 2020-2025 Stock Market
kaggle.com
zip
Updated Jan 13, 2025
Cite
Negin Moghadasi (2025). Google 2020-2025 Stock Market [Dataset]. https://www.kaggle.com/datasets/negmgh/google-2020-2025-stock-market
Available download formats: zip (23003 bytes)
Dataset updated
Jan 13, 2025
Authors
Negin Moghadasi
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Google 2020-2025 Stock Price

Alphabet Inc. is an American multinational technology conglomerate holding company headquartered in Mountain View, California. Alphabet is the world's second-largest technology company by revenue, after Apple, and one of the world's most valuable companies. It was created through a restructuring of Google on October 2, 2015, and became the parent holding company of Google and several former Google subsidiaries. It is considered one of the Big Five American information technology companies, alongside Amazon, Apple, Meta, and Microsoft.

The establishment of Alphabet Inc. was prompted by a desire to make the core Google business "cleaner and more accountable" while allowing greater autonomy to group companies that operate in businesses other than Internet services. Founders Larry Page and Sergey Brin announced their resignation from their executive posts in December 2019, with the CEO role to be filled by Sundar Pichai, who is also the CEO of Google. Page and Brin remain employees, board members, and controlling shareholders of Alphabet Inc.

Source: https://en.wikipedia.org/wiki/Alphabet_Inc.

Information about this dataset

This dataset provides historical data for GOOG (Google) stock. The data are available at a daily level, and the currency is USD.

The following terms are key indicators in stock market trading and analysis, providing information about a stock's price movements and trading activity over a specific period (e.g., a day, week, or month):

Close Price:

The final price at which a stock trades during a specific trading session (e.g., at the end of the day). This price is often used as a reference point for comparing daily price movements.

Open Price:

The first price at which a stock trades when the market opens for the day. It can be influenced by after-hours trading, news, or economic events.

High Price:

The highest price at which a stock trades during a specific trading session. It shows the maximum value reached by the stock in that period.

Low Price:

The lowest price at which a stock trades during a specific trading session. It represents the minimum value reached by the stock in that period.

Volume:

The total number of shares traded during a specific period. It indicates the level of interest or activity in a stock, with higher volumes often reflecting greater market interest or volatility.
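The five definitions above can be made concrete with a small sketch that derives them from an ordered list of trades in one session. The `Trade` and `Bar` types and the sample numbers are hypothetical, introduced only for illustration. The dataset itself already ships daily values, so this shows how such values relate to individual trades, not how the dataset was built.

```python
from typing import NamedTuple, Sequence


class Trade(NamedTuple):
    price: float
    shares: int


class Bar(NamedTuple):
    open: float
    high: float
    low: float
    close: float
    volume: int


def summarize_session(trades: Sequence[Trade]) -> Bar:
    """Collapse one session's trades (in time order) into open/high/low/close/volume."""
    if not trades:
        raise ValueError("a session needs at least one trade")
    prices = [t.price for t in trades]
    return Bar(
        open=prices[0],            # first traded price of the session
        high=max(prices),          # highest traded price
        low=min(prices),           # lowest traded price
        close=prices[-1],          # last traded price
        volume=sum(t.shares for t in trades),
    )


# Made-up trades in chronological order.
session = [Trade(100.0, 10), Trade(102.5, 5), Trade(99.0, 20), Trade(101.0, 8)]
bar = summarize_session(session)
print(bar)
```

Open and close depend on trade order, while high, low, and volume do not, so the input must be sorted by time before summarizing.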
MEDICINA-corpus_reducido+MIR+wiki

Spanish corpus + wiki information + med books