Collected COVID-19 datasets from various sources as part of DAAN-888 course, Penn State, Spring 2022. Collaborators: Mohamed Abdelgayed, Heather Beckwith, Mayank Sharma, Suradech Kongkiatpaiboon, and Alex Stroud
**1 - COVID-19 Data in the United States**
Source: The data is collected from multiple official public health sources by NY Times journalists and compiled into a single file.
Description: Daily count of new COVID-19 cases and deaths for each state. Data is updated daily and runs from 1/21/2020 to 2/4/2022.
URL: https://github.com/nytimes/covid-19-data/blob/master/us-states.csv
Data size: 38,814 rows and 5 columns
**2 - Mask-Wearing Survey Data**
Source: The New York Times releases estimates of mask usage by county in the United States.
Description: This data comes from a large number of interviews conducted online by the global data and survey firm Dynata at the request of The New York Times. The firm asked a question about mask usage and obtained 250,000 survey responses between July 2 and July 14, 2020, enough data to provide estimates more detailed than the state level.
URL: https://github.com/nytimes/covid-19-data/blob/master/mask-use/mask-use-by-county.csv
Data size: 3,142 rows and 6 columns
**3a - Vaccine Data – Global**
Source: This data comes from the US Centers for Disease Control and Prevention (CDC), Our World in Data (OWiD) and the World Health Organization (WHO).
Description: Time series data of vaccine doses administered and the number of fully and partially vaccinated people by country. This data was last updated on February 3, 2022
URL: https://github.com/govex/COVID-19/blob/master/data_tables/vaccine_data/global_data/time_series_covid19_vaccine_global.csv
Data Size: 162,521 rows and 8 columns
**3b - Vaccine Data – United States**
Source: The data is compiled from individual states' public dashboards and from the US Centers for Disease Control and Prevention (CDC).
Description: Time series data of total vaccine doses shipped and administered, broken down by manufacturer, dose number (first or second), and state. This data was last updated on February 3, 2022.
URL: https://github.com/govex/COVID-19/blob/master/data_tables/vaccine_data/us_data/time_series/vaccine_data_us_timeline.csv
Data Size: 141,503 rows and 13 columns
**4 - Testing Data**
Source: The data is compiled from individual states' public dashboards and from the U.S. Department of Health & Human Services.
Description: Time series data of total tests administered by county and state. This data was last updated on January 25, 2022.
URL: https://github.com/govex/COVID-19/blob/master/data_tables/testing_data/county_time_series_covid19_US.csv
Data size: 322,154 rows and 8 columns
**5 – US State and Territorial Public Mask Mandates**
Source: Data from state and territory executive orders, administrative orders, resolutions, and proclamations is gathered from government websites, then cataloged and coded in Microsoft Excel by one coder, with quality checking provided by one or more other coders.
Description: US State and Territorial Public Mask Mandates from April 10, 2020 through August 15, 2021, by county by day.
URL: https://data.cdc.gov/Policy-Surveillance/U-S-State-and-Territorial-Public-Mask-Mandates-Fro/62d6-pm5i
Data Size: 1,593,869 rows and 10 columns
**6 – Case Counts & Transmission Level**
Source: This open-source dataset contains seven data items that describe community transmission levels across all counties. It provides the same numbers used to show transmission maps on the COVID Data Tracker and contains reported daily transmission levels at the county level. The dataset is updated every day to include the most current day's data. Calculation procedures are used to classify the transmission level as low, moderate, considerable, or high.
Description: US State and County case counts and transmission level from 16-Aug-2021 to 03-Feb-2022
URL: https://data.cdc.gov/Public-Health-Surveillance/United-States-COVID-19-County-Level-of-Community-T/8396-v7yb
Data Size: 550,702 rows and 7 columns
**7 - World Cases & Vaccination Counts**
Source: This is an open-source dataset collected and maintained by Our World in Data (OWID). OWID provides research and data to make progress against the world's largest problems.
Description: This dataset includes vaccinations, tests & positivity, hospital & ICU, confirmed cases, confirmed deaths, reproduction rate, policy responses and other variables of interest.
URL: https://github.com/owid/covid-19-data/tree/master/public/data
Data Size: 157,000 rows and 67 columns
**8 - COVID-19 Data in the European Union**
Source: This is an open-source dataset collected and maintained by the European Centre for Disease Prevention and Control (ECDC), an EU agency aimed at strengthening Europe's defenses against infectious diseases.
Description: This dataset co...
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In experiment 4a, the first five rows of data indicate the proportion of times participants perceived an unbound test array as being larger than a fixed unbound reference array. The second five rows indicate the proportion of times participants perceived an array with a line connecting the local elements as being larger than a fixed unbound reference array. Each proportion was calculated from 20 trials.
In experiment 4b, the first five rows of data indicate the proportion of times participants perceived an unbound test array as being larger than a fixed unbound reference array. The second five rows indicate the proportion of times participants perceived an array with a line intersecting only the interiors of the elements as being larger than a fixed unbound reference array. Each proportion was calculated from 20 trials.
https://www.technavio.com/content/privacy-notice
Data Center Construction Market Size 2025-2029
The data center construction market is projected to grow by USD 41 billion at a CAGR of 8.8% from 2024 to 2029. Rising demand for data center colocation facilities will drive this growth.
Major Market Trends & Insights
Europe dominated the market and is estimated to account for 32% of the market's growth during the forecast period.
By Application - Enterprise segment was valued at USD 23.20 billion in 2023
By Type - Electrical construction segment accounted for the largest market revenue share in 2023
Market Size & Forecast
Market Opportunities: USD 70.71 billion
Market Future Opportunities: USD 41.00 billion
CAGR : 8.8%
Europe: Largest market in 2023
Market Summary
The market is a dynamic and continuously evolving sector, driven by the rising demand for colocation facilities and the growing focus on constructing energy-efficient, or 'green,' data centers. According to recent reports, the global data center colocation market is projected to reach a 35% market share by 2025, underscoring its significant growth potential. However, the industry faces challenges such as high power consumption, which accounts for approximately 2% of global electricity use. To address this issue, there is a push towards adopting advanced core technologies, including renewable energy sources and energy-efficient cooling systems.
Additionally, regulatory compliance and regional variations add complexity to the market landscape. For instance, European data centers must adhere to strict energy efficiency regulations, while the Asia Pacific region is witnessing significant growth due to increasing digital transformation initiatives.
What will be the Size of the Data Center Construction Market during the forecast period?
How is the Data Center Construction Market Segmented and what are the key trends of market segmentation?
The data center construction industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD billion' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.
Application
Enterprise
Cloud
Colocation
Hyperscale
Type
Electrical construction
Mechanical construction
General construction
Geography
North America
US
Canada
Europe
France
Germany
Italy
UK
APAC
China
Japan
South Korea
South America
Brazil
Rest of World (ROW)
By Application Insights
The enterprise segment is estimated to witness significant growth during the forecast period.
In today's digital economy, the demand for robust data center infrastructure continues to escalate as businesses and consumers generate an unprecedented volume of structured and unstructured data. Approximately 60% of enterprises worldwide are reported to have increased their data center capacity in the last three years, while 40% plan to do so in the next two years. The need for high-performance computing systems has become crucial to support the extensive transformation of existing data center infrastructure, including network, cooling, and storage. Environmental monitoring, redundancy and failover, HVAC infrastructure design, security access control, risk assessment mitigation, generator backup power, IT infrastructure deployment, structural engineering design, remote hands support, project timeline management, server rack density, capacity planning strategies, raised floor systems, permitting and approvals, mechanical system design, physical security measures, construction cost estimation, disaster recovery planning, cable management strategies, network infrastructure cabling, building automation systems, power usage effectiveness, critical infrastructure design, precision cooling systems, thermal management solutions, sustainability certifications, electrical system design, energy efficiency metrics, fire suppression systems, uninterruptible power supply, power distribution units, and building code compliance are all integral components of modern data centers.
The Enterprise segment was valued at USD 23.20 billion in 2019 and showed a gradual increase during the forecast period.
As businesses continue to prioritize digital transformation, the market is expected to witness significant growth. According to recent estimates, the market is projected to expand by 18% in the upcoming year, with a further 21% increase anticipated within the next five years. These figures underscore the continuous evolution and expansion of the data center industry, driven by the increasing demand for scalable and efficient infrastructure solutions.
Regional Analysis
Europe is estimated to contribute 32% to the growth of the global market.
ABSTRACT The variability within cultivation rows may reduce the accuracy of experiments conducted in a randomized complete block design if the rows are treated as blocks; however, little is known about this variability in protected environments. Thus, our aim was to study the variability of lettuce shoot fresh mass grown in a protected environment and to verify the effect of borders and of experimental unit size on minimizing the productive variability. Data from two uniformity trials carried out in a greenhouse in autumn and spring growing seasons were used. In the statistical analyses, cultivation rows were considered to run parallel to the lateral openings of the greenhouse, with columns perpendicular to these openings. Different scenarios were simulated by excluding rows and columns to generate several border arrangements and also different sizes of the experimental unit. For each scenario, a test of homogeneity of variances between the remaining rows and columns was performed, and the variance and coefficient of variation were calculated. There is variability among rows in trials with lettuce in plastic greenhouses, and the use of borders does not bring benefits in terms of reducing the coefficient of variation or minimizing the cases of heterogeneous variances among rows. In experiments with lettuce in a plastic greenhouse, the use of an experimental unit size greater than or equal to two plants provides homogeneity of variances among rows and columns and, therefore, allows the use of a completely randomized design.
http://opendatacommons.org/licenses/dbcl/1.0/
This is a condensed version of the raw data obtained through the Google Data Analytics Course, made available by Lyft and the City of Chicago under this license (https://ride.divvybikes.com/data-license-agreement).
I originally did my study in another platform, and the original files were too large to upload to Posit Cloud in full. Each of the 12 monthly files contained anywhere from 100k to 800k rows. Therefore, I decided to reduce the number of rows drastically by performing grouping, summaries, and thoughtful omissions in Excel for each csv file. What I have uploaded here is the result of that process.
Data is grouped by: month, day, rider_type, bike_type, and time_of_day. total_rides represents the count of original rides (i.e., the number of rows) that were combined to make each new summarized row; avg_ride_length is the calculated average of all ride lengths in each grouping.
Be sure to use weighted averages if you want to calculate the mean of avg_ride_length for different subgroups, as the values in this file are already averages of the summarized groups. Include the total_rides value as the weight in your calculation to weight properly.
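For those working in Python rather than R, here is a minimal pandas sketch of that weighted-average calculation (the file name divvy_2022_summary.csv is a placeholder; the column names are the ones documented below):
```
import pandas as pd

# Load the summarized dataset (placeholder file name).
df = pd.read_csv("divvy_2022_summary.csv")

# Weighted mean of avg_ride_length per rider_type: each summarized row
# is weighted by the number of original rides it represents (total_rides).
weighted = df.groupby("rider_type").apply(
    lambda g: (g["avg_ride_length"] * g["total_rides"]).sum() / g["total_rides"].sum()
)
print(weighted)
```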
date - year, month, and day in date format - includes all days in 2022
day_of_week - Actual day of week as character. Set up a new sort order if needed.
rider_type - values are either 'casual', those who pay per ride, or 'member', for riders who have annual memberships.
bike_type - Values are 'classic' (non-electric, traditional bikes), or 'electric' (e-bikes).
time_of_day - this divides the day into 6 equal time frames, 4 hours each, starting at 12AM. Each individual ride was placed into one of these time frames using the time they STARTED their rides, even if the ride was long enough to end in a later time frame. This column was added to help summarize the original dataset.
total_rides - Count of all individual rides in each grouping (row). This column was added to help summarize the original dataset.
avg_ride_length - The calculated average of all rides in each grouping (row). Look to total_rides to know how many original ride length values were included in this average. This column was added to help summarize the original dataset.
min_ride_length - Minimum ride length of all rides in each grouping (row). This column was added to help summarize the original dataset.
max_ride_length - Maximum ride length of all rides in each grouping (row). This column was added to help summarize the original dataset.
Please note: the time_of_day column has inconsistent spacing. In R, use mutate(time_of_day = gsub(" ", "", time_of_day)) to remove all spaces.
Below is the list of revisions I made in Excel before uploading the final csv files to the R environment:
Deleted station location columns and lat/long as much of this data was already missing.
Deleted ride id column since each observation was unique and I would not be joining with another table on this variable.
Deleted rows pertaining to "docked bikes" since there were no member entries for this type and I could not compare member vs casual rider data. I also received no information in the project details about what constitutes a "docked" bike.
Used ride start time and end time to calculate a new column called ride_length (by subtracting), and deleted all rows with 0 and 1 minute results, which were explained in the project outline as being related to staff tasks rather than users. An example would be taking a bike out of rotation for maintenance.
Placed start time into a range of times (time_of_day) in order to group more observations while maintaining general time data. time_of_day now represents a time frame when the bike ride BEGAN. I created six 4-hour time frames, beginning at 12AM.
Added a Day of Week column, with Sunday = 1 and Saturday = 7, then changed from numbers to the actual day names.
Used pivot tables to group total_rides, avg_ride_length, min_ride_length, and max_ride_length by date, rider_type, bike_type, and time_of_day.
Combined into one csv file with all months, containing less than 9,000 rows (instead of several million)
This query returns the total number of assets published on the Enterprise Data Platform and the total number of rows, columns and values published in datasets.
https://spdx.org/licenses/etalab-2.0.html
NumPy tensors to train and test a convolutional neural network dedicated to determining crystallite size and/or microstrain from X-ray diffraction (XRD) data:
train_size.npz: training dataset with only crystallite size
test_size.npz: testing dataset with only crystallite size
train_size_strain.npz: training dataset with crystallite size and microstrain
test_size_strain.npz: testing dataset with crystallite size and microstrain
Each dataset contains the XRD data and the labels ("ground truth") in the form of 2D tensors, with 10501 data points (columns) for the XRD data and 24 labels (columns) for the labels. Training data contain 71971 rows; testing data contain 7997 rows.
Example python script to read the data:
import numpy as np
train = np.load("train_size.npz")
train_data, train_label = train["train_data"], train["train_label"]
print(f"Train data shape: {train_data.shape}, Train labels shape: {train_label.shape}")
Jupyter notebooks to train and test a neural network can be found here: https://github.com/aboulle/LPA-NN
C Insights Africa's company database contains details of more than 10,000 organizations in Nigeria, ranging from large corporates to mid-sized and small companies. Our database contains attributes such as company size, address(es), contact details, type of business and related companies (where applicable). Marketing and sales executives can enrich their pipeline with our database, while business development teams or C-suite executives interested in finding new partners/frontiers are sure to find this database invaluable.
A within-species trade-off between growth rate and lifespan has been observed across different taxa of trees; however, there is some uncertainty whether this trade-off also applies to shade-intolerant tree species. The main objective of this study was to investigate the relationships between radial growth, tree size and lifespan of shade-intolerant mountain pines. For 200 dead standing mountain pines (Pinus montana) located along gradients of aspect, slope steepness and elevation in the Swiss National Park, radial annual growth rates and lifespan were reconstructed. While early growth (i.e. mean tree-ring width over the first 50 years) correlated positively with diameter at the time of tree death, a negative correlation resulted with lifespan; i.e., rapidly growing mountain pines reach a large diameter at the cost of early tree death. Slowly growing mountain pines may reach a large diameter and a long lifespan, but risk dying young at a small size. Early gro...
https://wemarketresearch.com/privacy-policy
The Bioinformatics Services Market will grow from $4.3B in 2025 to $15.7B by 2035, at a CAGR of 12.6%, driven by rising demand for biologics and biosimilars.
| Report Attribute | Description |
|---|---|
| Market Size in 2025 | USD 4.3 Billion |
| Market Forecast in 2035 | USD 15.7 Billion |
| CAGR % 2025-2035 | 12.6% |
| Base Year | 2024 |
| Historic Data | 2020-2024 |
| Forecast Period | 2025-2035 |
| Report USP | Production, Consumption, company share, company heatmap, company production capacity, growth factors and more |
| Segments Covered | By Service Type, By Application, By End-user |
| Regional Scope | North America, Europe, APAC, Latin America, Middle East and Africa |
| Country Scope | U.S., Canada, U.K., Germany, France, Italy, Spain, Benelux, Nordic Countries, Russia, China, India, Japan, South Korea, Australia, Indonesia, Thailand, Mexico, Brazil, Argentina, Saudi Arabia, UAE, Egypt, South Africa, Nigeria |
https://creativecommons.org/publicdomain/zero/1.0/
This dataset is a cleaned version of the Chicago Crime Dataset, which can be found here. All rights for the dataset go to the original owners. The purpose of this dataset is to display my skills in visualizations and creating dashboards. To be specific, I will attempt to create a dashboard that will allow users to see metrics for a specific crime within a given year using filters and metrics. Due to this, there will not be much of a focus on the analysis of the data, but there will be portions discussing the validity of the dataset, the steps I took to clean the data, and how I organized it. The cleaned datasets can be found below, the Query (which utilized BigQuery) can be found here and the Tableau dashboard can be found here.
The dataset comes directly from the City of Chicago's website under the page "City Data Catalog." The data is gathered directly from the Chicago Police's CLEAR (Citizen Law Enforcement Analysis and Reporting) and is updated daily to present the information accurately. This means that a crime on a specific date may be changed to better display the case. The dataset represents crimes starting all the way from 2001 to seven days prior to today's date.
Using the ROCCC method, we can see that:
* The data has high reliability: The data covers the entirety of Chicago over a little more than two decades. It covers all the wards within Chicago and even gives the street names. While we may not have an idea of how big the sample size is, I do believe that the dataset has high reliability since it geographically covers the entirety of Chicago.
* The data has high originality: The dataset was obtained directly from the Chicago Police Dept. using their database, so we can say this dataset is original.
* The data is somewhat comprehensive: While we do have important information such as the types of crimes committed and their geographic location, I do not think this gives us proper insights as to why these crimes take place. We can pinpoint the location of the crime, but we are limited by the information we have. How hot was the day of the crime? Did the crime take place in a low-income neighborhood? I believe that these missing factors prevent us from getting proper insights as to why these crimes take place, so I would say that this dataset is subpar in how comprehensive it is.
* The data is current: The dataset is updated frequently to display crimes that took place up to seven days prior to today's date and may even update past crimes as more information comes to light. Due to the frequent updates, I do believe the data is current.
* The data is cited: As mentioned prior, the data is collected directly from the police's CLEAR system, so we can say that the data is cited.
The purpose of this step is to clean the dataset such that there are no outliers in the dashboard. To do this, we are going to do the following: * Check for any null values and determine whether we should remove them. * Update any values where there may be typos. * Check for outliers and determine if we should remove them.
The following steps will be explained in the code segments below. (I used BigQuery for this so the coding will follow BigQuery's syntax) ```
-- Preview the table to get familiar with the columns.
SELECT
  *
FROM
  `portfolioproject-350601.ChicagoCrime.Crime`
LIMIT 1000;

-- Identify rows with NULL values in the key columns.
SELECT
  *
FROM
  `portfolioproject-350601.ChicagoCrime.Crime`
WHERE
  unique_key IS NULL OR
  case_number IS NULL OR
  date IS NULL OR
  primary_type IS NULL OR
  location_description IS NULL OR
  arrest IS NULL OR
  longitude IS NULL OR
  latitude IS NULL;

-- Remove the rows with NULL values in those columns.
DELETE FROM
  `portfolioproject-350601.ChicagoCrime.Crime`
WHERE
  unique_key IS NULL OR
  case_number IS NULL OR
  date IS NULL OR
  primary_type IS NULL OR
  location_description IS NULL OR
  arrest IS NULL OR
  longitude IS NULL OR
  latitude IS NULL;
SELECT unique_key, COUNT(unique_key) FROM `portfolioproject-350601.ChicagoCrime....
https://www.futuremarketinsights.com/privacy-policy
Organizations have been overwhelmed with vast amounts of data generated from various sources such as enterprise applications, IoT devices, social media platforms, and cloud services. Effectively harnessing this data to drive business insights and innovation has become a critical imperative for organizations seeking to maintain competitiveness and relevance in their respective industries.
| Attributes | Key Insights |
|---|---|
| Data Orchestration Tool Market Estimated Size in 2024 | US$ 1.3 billion |
| Projected Market Value in 2034 | US$ 4.3 billion |
| Value-based CAGR from 2024 to 2034 | 12.1% |
Country-wise Insights
| Country | CAGR through 2034 |
|---|---|
| The United States | 8.1% |
| Germany | 5.3% |
| China | 12.6% |
| Japan | 4.2% |
| Australia and New Zealand | 7.3% |
Category-wise Insights
| Category | Shares in 2024 |
|---|---|
| Cloud Based | 62.3% |
| Telecommunications | 24.2% |
Report Scope
| Attribute | Details |
|---|---|
| Estimated Market Size in 2024 | US$ 1.3 billion |
| Projected Market Valuation in 2034 | US$ 4.3 billion |
| Value-based CAGR 2024 to 2034 | 12.1% |
| Forecast Period | 2024 to 2034 |
| Historical Data Available for | 2019 to 2023 |
| Market Analysis | Value in US$ Billion |
| Key Regions Covered | |
| Key Market Segments Covered | |
| Key Countries Profiled | |
| Key Companies Profiled | |
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was compiled with the ultimate goal of developing non-invasive computer vision algorithms for assessing shrimp biometrics and biomass estimation. The main folder, labeled "DATASET," contains five sub-folders—DB1, DB2, DB3, DB4, and DB5—each filled with images of shrimps. Additionally, each sub-folder is accompanied by an Excel file that includes manually measured data for the shrimps pictured. The files are named respectively: DB1_INDUSTRIAL_FARM_1, DB2_INDUSTRIAL_FARM_2_C1, DB3_INDUSTRIAL_FARM_2_C2, DB4_ACADEMIC_POND_S1, and DB5_ACADEMIC_POND_S2.
Here’s a detailed description of the contents of each sub-folder and its corresponding Excel file:
1) DB1 includes 490 PNG images of 22 shrimps taken from one pond at an industrial farm. The associated Excel file, DB1_INDUSTRIAL_FARM_1, contains columns for: SAMPLE: Reflecting the number of individual shrimps (22 entries or rows). LENGTH (cm): Measuring from the rostrum (near the eyes) to the start of the tail. WEIGHT (g): Recorded using a scale. COMPLETE SHRIMP IMAGES: Indicates if at least one full-body image is available (1) or not (0).
2) DB2 consists of 2002 PNG images of 58 shrimps. The Excel file, DB2_INDUSTRIAL_FARM_2_C1, includes: SAMPLE: Number of shrimps (58 entries or rows). CEPHALOTHORAX (cm): Total length of the cephalothorax. LENGTH (cm) and WEIGHT (g): Similar measurements as DB1. COMPLETE SHRIMP IMAGES: Presence (1) or absence (0) of full-body images.
3) DB3 contains 1719 PNG images of 50 shrimps, with its Excel file, DB3_INDUSTRIAL_FARM_2_C2, documenting: SAMPLE: Number of shrimps (50 entries or rows). Measurements and categories identical to DB2.
4) DB4 encompasses 635 PNG images of 20 shrimps, detailed in the Excel file DB4_ACADEMIC_POND_S1. This includes: SAMPLE: Number of shrimps (20 entries or rows). CEPHALOTHORAX (cm), LENGTH (cm), WEIGHT (g), and COMPLETE SHRIMP IMAGES: Documented as in other datasets.
5) DB5 includes 661 PNG images of 20 shrimps, with DB5_ACADEMIC_POND_S2 as the corresponding Excel file. The file mirrors the structure and measurements of DB4.
The images in each folder are named "sm_n", where m is the shrimp sample number and n is the picture number for that shrimp. This carefully structured dataset provides comprehensive biometric data on shrimps, facilitating the development of algorithms aimed at non-invasive measurement techniques. This will likely be pivotal in enhancing the precision of biomass estimation in aquaculture farming, utilizing advanced statistical morphology analysis and machine learning techniques.
CHANGES FROM VERSION 1:
The cephalothorax metric is the length rather than the width. That was an error in the first version. The name in the columns also had a typo, which has been corrected (from CEPHALOTORAX to CEPHALOTHORAX).
https://www.marketreportanalytics.com/privacy-policy
The Data Center Cooling market is experiencing robust growth, projected to reach $1452.12 million in 2025 and maintain a Compound Annual Growth Rate (CAGR) of 6.78% from 2025 to 2033. This expansion is fueled by several key factors. The increasing density of data centers, driven by the exponential growth of data generated globally, necessitates advanced cooling solutions to prevent overheating and ensure optimal performance. Furthermore, rising energy costs and growing concerns about environmental sustainability are pushing the adoption of energy-efficient cooling technologies like liquid cooling and adiabatic cooling systems. The market is segmented by cooling type, with room-cooling, rack-cooling, and row-cooling solutions catering to diverse data center needs and sizes. Leading companies are aggressively pursuing innovative strategies, including mergers and acquisitions, strategic partnerships, and research and development investments, to strengthen their market positions and capitalize on this burgeoning market. Geographic expansion, particularly in rapidly developing economies in Asia-Pacific and other regions with increasing data center deployments, presents significant growth opportunities. However, challenges such as high initial investment costs associated with advanced cooling systems and the need for skilled professionals to manage and maintain these complex technologies may act as restraints. The competitive landscape is marked by the presence of both established players and emerging technology companies. Major players like 3M, Daikin, Schneider Electric, and Vertiv are leveraging their technological expertise and extensive distribution networks to maintain their dominance. Meanwhile, smaller, innovative companies are introducing niche solutions and challenging the incumbents. The market's future growth trajectory hinges on technological advancements, the evolution of data center designs, and the ongoing demand for environmentally sustainable cooling solutions. The consistent need for reliable, energy-efficient, and scalable cooling infrastructure will be the primary driver of this market's continued expansion throughout the forecast period.
https://www.futuremarketinsights.com/privacy-policy
The global mobile payment data protection market is anticipated to grow to USD 730,843.3 million in 2024, up from USD 659,096.1 million in 2023. The industry is expected to sustain this expansion, reaching USD 2,366,892.7 million by 2034 at a CAGR of 12.5%.
| Attributes | Description |
|---|---|
| Estimated Global Mobile Payment Data Protection Market Size, 2024 | USD 730,843.3 million |
| Projected Global Mobile Payment Data Protection Market Size, 2034 | USD 2,366,892.7 million |
| Value-based CAGR (2024 to 2034) | 12.5% CAGR |
Semi Annual Market Update
| Particular | Value CAGR |
|---|---|
| H1 | 9.8% (2023 to 2033) |
| H2 | 10.2% (2023 to 2033) |
| H1 | 10% (2024 to 2034) |
| H2 | 10.2% (2024 to 2034) |
Country-wise Insights
| Countries | CAGR from 2024 to 2034 |
|---|---|
| Australia | 16% |
| China | 13% |
| United States | 9.3% |
| Germany | 7.9% |
| Japan | 7.2% |
Category-wise Insights
| Segment | Value Share (2024) |
|---|---|
| Contactless Tokenisation (Product) | 56.2% |
| Banking and Financial Service (End User) | 33.7% |
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Synthetic data correspond to the ENCODE data for cell lines HepG2 (https://www.encodeproject.org/biosamples/ENCBS282XVK/) and K562 (https://www.encodeproject.org/biosamples/ENCBS023XVB/). The data and networks were generated using GeneSPIDER (publicly available at https://bitbucket.org/sonnhammergrni/genespider/).
Table 1. Description of the files
| File | Description |
|---|---|
| data_HepG2like_SNR_L=0.0054699_diff=1.6188e-05.txt | Synthetic gene expression knockdown (shRNA-seq) data imitating the ENCODE data for the HepG2 cell line. Data size: 232 RBPs vs 464 experiments (2 replicates). SNR_L is the signal-to-noise ratio. The difference (diff) value gives the difference between the replicate correlation coefficients of the real and synthetic ENCODE data. Columns represent experiments, rows represent genes. |
| data_K562like_SNR_L=0.0028692_diff=0.00017339.txt | Synthetic gene expression knockdown (shRNA-seq) data imitating the ENCODE data for the K562 cell line. Data size: 232 RBPs vs 464 experiments (2 replicates). SNR_L is the signal-to-noise ratio. The difference (diff) value gives the difference between the replicate correlation coefficients of the real and synthetic ENCODE data. Columns represent experiments, rows represent genes. |
| network_HEPG2like_sparsity4.txt | Synthetic scale-free gene regulatory network compatible with data_HepG2like_SNR_L=0.0054699_diff=1.6188e-05.txt. Sparsity (average node degree) is 4, including self-loops. Direction should be read from columns to rows. |
| network_K562like_sparsity4.txt | Synthetic scale-free gene regulatory network compatible with data_K562like_SNR_L=0.0028692_diff=0.00017339.txt. Sparsity (average node degree) is 4, including self-loops. Direction should be read from columns to rows. |
| perturbations_HepG2&K562_2replicates.txt | Perturbation matrix with information about the knocked-down RBPs. Data size: 232 RBPs vs 464 experiments (2 replicates). |
Created by Garbulowski et al. (2024) as a part of the work entitled "Comprehensive analysis of the RBP regulome reveals functional modules and drug candidates in liver cancer"
This dataset is designed for beginners to practice regression problems, particularly in the context of predicting house prices. It contains 1000 rows, with each row representing a house and various attributes that influence its price. The dataset is well-suited for learning basic to intermediate-level regression modeling techniques.
Beginner Regression Projects: This dataset can be used to practice building regression models such as Linear Regression, Decision Trees, or Random Forests. The target variable (house price) is continuous, making this an ideal problem for supervised learning techniques.
Feature Engineering Practice: Learners can create new features by combining existing ones, such as the price per square foot or age of the house, providing an opportunity to experiment with feature transformations.
Exploratory Data Analysis (EDA): You can explore how different features (e.g., square footage, number of bedrooms) correlate with the target variable, making it a great dataset for learning about data visualization and summary statistics.
Model Evaluation: The dataset allows for various model evaluation techniques such as cross-validation, R-squared, and Mean Absolute Error (MAE). These metrics can be used to compare the effectiveness of different models.
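As an illustration of the evaluation workflow described above, here is a minimal scikit-learn sketch (the file name house_prices.csv and the column names square_footage, num_bedrooms, and price are assumed placeholders, not part of the dataset documentation):
```
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Placeholder file and column names; adjust to the actual CSV schema.
df = pd.read_csv("house_prices.csv")
X = df[["square_footage", "num_bedrooms"]]  # example numeric features
y = df["price"]                             # continuous target

model = LinearRegression()

# 5-fold cross-validation with two of the metrics mentioned above.
mae = -cross_val_score(model, X, y, cv=5, scoring="neg_mean_absolute_error")
r2 = cross_val_score(model, X, y, cv=5, scoring="r2")
print(f"MAE: {mae.mean():.2f}, R-squared: {r2.mean():.3f}")
```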
The dataset is highly versatile for a range of machine learning tasks. You can apply simple linear models to predict house prices based on one or two features, or use more complex models like Random Forest or Gradient Boosting Machines to understand interactions between variables.
It can also be used for dimensionality reduction techniques like PCA or to practice handling categorical variables (e.g., neighborhood quality) through encoding techniques like one-hot encoding.
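For the categorical-encoding use case just mentioned, a short sketch (the neighborhood_quality column name is hypothetical):
```
import pandas as pd

# One-hot encode a hypothetical categorical column into indicator variables.
df = pd.read_csv("house_prices.csv")
encoded = pd.get_dummies(df, columns=["neighborhood_quality"], drop_first=True)
print(encoded.columns.tolist())
```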
This dataset is ideal for anyone wanting to gain practical experience in building regression models while working with real-world features.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Version 1.3, updated 11/15/2024.
Added a file, 'NewRegionalSamples.xlsx', with mineral composition information for 27 regional dust samples, along with the corresponding refractive index data.
All refractive index files here have 127 rows (wavelengths) and 27 columns (samples)
'kall27_coarse.dat' is the imaginary part of the coarse mode.
'kall27_fine.dat' is the imaginary part of the fine mode.
'nall27_coarse.dat' is the real part of the coarse mode.
'nall27_fine.dat' is the real part of the fine mode.
Version 1.2, updated 04/23/2024. Major changes: Changed all data file names to the new format "mix"+{property name}+{number}, and rearranged the numbering of the mixing samples.
Updated all the bulk optical property data. This version uses constant values of the standard deviation in the lognormal size distribution settings for the coarse mode and the fine mode, respectively.
The phase matrices are separated from the other bulk properties due to their large file sizes. The readme file is updated correspondingly. The scattering angle information (498 angles in total) is uploaded as "TAMUdust2020_Angle.dat".
Added supplemental file data in 'Supplemental.tar.gz'.
Additional refractive indices are zipped in 'AdditionalRefInd.tar.gz'
Version 1.1, updated 03/14/2024. Major changes: Added mixed bulk properties for "0 (99% coarse + 1% fine)" and "11 (2.0 µm coarse + 0.4 µm fine)". Added "reff.dat" to 'BulkProperties.tar.gz'; the data include four columns: fine mode fraction, bulk projected area, bulk volume, and effective radius r_eff. The information is for mixed sample numbers 0 to 11, each corresponding to one row. Added refractive indices for chlorite, mica, smectite, pyroxene, vermiculite and pyroxenes. These groups can be applied in some other models.
Version 1.0, uploaded 01/02/2024.
This database includes supplemental data and files for the following publication:
Sensitivities of Spectral Optical Properties of Dust Aerosols to their Mineralogical and Microphysical Properties. Yuheng Zhang, M. Saito, P. Yang, G. L. Schuster, and C. R. Trepte, J. Geophys. Res. Atmos. 2024.
The supplemental data include:
1) 'GroupRefInd.tar.gz': Mineral (group) refractive index files. E.g., 1All_Illite.dat contains the complex refractive index files of the illite group. Format (from left to right columns): Wavelength (unit: µm), Real part (n), Imaginary part (k), standard deviation of n, standard deviation of k.
The file 'fine_log.dat' includes the mean and standard deviation values of n and k for all the generated fine mode dust samples at 11,044 wavelengths from 0.2 to 50 micron.
The file 'fine_log127.dat' only includes the values at 127 wavelengths from 0.2 to 50 micron (defined in 'swav.txt' and 'lwav.txt'), and is used for the bulk property computations.
The files 'coarse_log.dat' and 'coarse_log127.dat' are for the coarse mode dust samples.
2) 'CompositionFraction.xlsx': Mineral composition data sources/references and composition data (mean and standard deviation values of each group).
'Vlog_coarse.dat': Randomly generated VOLUME FRACTION of 9 mineral groups for the coarse mode dust. Left to right: Illite, Kaolinite, Montmorillonite (Other clays), Quartz, Feldspar, Carbonate, Gypsum (Sulphate), Hematite, Goethite.
'Vlog_fine.dat': For the fine mode dust.
3) 'RefSources.xlsx': The data source references of mineral refractive indices. We didn't include the olivine, other silicates, soot and titanium-rich minerals in the paper, but the refractive indices are available for those who are interested. Chlorite, Mica and Vermiculite group are mentioned in some studies, and we included the refractive indices for these minerals as well.
4) 'DustSamples.tar.gz': Dust sample refractive index files. The files are enclosed in four folders: fine_sw/ fine_lw/ coarse_sw/ coarse_lw/.
fine: fine mode. coarse: coarse mode.
'sw' means shortwave (< 4 µm, in total 76 wavelengths defined in 'swav.txt') while 'lw' means longwave (>= 4 µm, in total 51 wavelengths defined in 'lwav.txt').
All files start with 'rdn', which means that they are computed based on randomly generated composition (data given in sheet 2 of 'CompositionFraction.xlsx').
The four digit number after 'rdn' is the index of each dust sample. In total, there are 5,000 samples. The sample composition is the same for the same sample index in the same size mode (fine/coarse). Data file format (from left to right columns): real part, imaginary part.
5) 'BulkProperties.tar.gz': Bulk property files (excluding phase matrices).
'mixqx.dat' files format (from left to right columns): Extinction efficiency (Qext), Scattering efficiency (Qsca), Backscattering efficiency (Qbck), and Asymmetry coefficient (Qasy). To obtain the asymmetry factor, use Qasy/Qsca.
'mixbkx.dat' files format (from left to right columns): P11(pi) P12(pi) P22(pi) P33(pi) P34(pi) P44(pi).
'x' refers to the number at the end of the file name. It can be 100 ~ 112, each represents a setting of coarse and fine mode effective radius and volume fraction (see details in "reff.dat")
'reff.dat' contains the effective radius information of the mixture. It has 7 columns: File number "x", Fine mode volume fraction, Fine mode effective radius (µm), Coarse mode effective radius (µm), Bulk projected area (µm^2), Bulk volume (µm^3), Bulk effective radius (µm).
6) 'PhaseMatrices.tar.gz': Phase matrices data.
'mixphswx.dat' files contain phase matrix results at 532 nm (shortwave). From left to right: P11, P12, P22, P33, P34, P44.
'mixphlwx.dat' files contain phase matrix results at 10.5 µm (longwave).
There are 635,000 rows in each data file: 635,000 rows = 127 wavelengths * 5,000 samples. Rows 1~127 are sample 1, rows 128~254 are sample 2, etc. We suggest using the MATLAB function 'reshape(property, 127, 5000)' for each column when processing the data.
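For users working in Python instead of MATLAB, a minimal equivalent sketch (this assumes the .dat files are plain whitespace-delimited text, which is an assumption about the file format, and uses one file name following the pattern described above):
```
import numpy as np

# Load one phase-matrix file: 635,000 rows x 6 columns (P11, P12, P22, P33, P34, P44).
data = np.loadtxt("mixphsw100.dat")

# Reshape one property column so that row i holds the 127 wavelengths of sample i+1
# (the transpose of MATLAB's reshape(property, 127, 5000)).
p11 = data[:, 0].reshape(5000, 127)
print(p11.shape)  # (5000, 127)
```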
7) 'Supplemental.tar.gz'
We also include data files mentioned in the supplemental file of the paper. The adjusted source data files of the nine mineral groups are included.
The supplemental bulk property files are named based on the figure number.
8) 'AdditionalRefInd.tar.gz'
We also include additional refractive indices for chlorite, smectite, vermiculite, mica, dolomite, titanium-rich minerals, pyroxenes and soot. These data can be useful in other models.
For more detailed information and datasets, please contact: Yuheng Zhang, yuheng98@tamu.edu or yuhengz98@qq.com.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ABSTRACT The definition of experimental plot size is an essential tool to ensure precision in the statistical analysis of experiments. The objective of this study was to estimate the plot size for the cactus pear cv. Gigante using the Modified Maximum Curvature Method, under the semi-arid conditions of Northeastern Brazil. The uniformity test was conducted at the Federal Institute of Bahia, Guanambi Campus, Bahia state, Brazil, during the agricultural period from 2009 to 2011. The spatial arrangement was composed of ten rows with 50 plants each, whose evaluated area was formed by the eight central rows with 48 plants per row, totaling 384 plants and an area of 153.60 m². The following variables were evaluated: plant height; length, width and thickness of cladode; number of cladodes; total area of cladodes; cladode area; and green mass yield in the third production cycle. In the evaluations, each plant was considered as a basic experimental unit (BEU), with an area of 0.4 m², comprising 384 basic units (BU); adjacent units were combined to form 15 pre-established plot sizes with rectangular shapes arranged along the rows. The characteristics total area of cladodes and green mass yield require larger plot sizes to be evaluated with greater experimental accuracy. For experimental evaluation of cactus pear cv. Gigante, plot size should be eight plants in the direction of the crop row.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The Market_Basket_Optimisation dataset is a classic transactional dataset often used in association rule mining and market basket analysis.
It consists of multiple transactions where each transaction represents the collection of items purchased together by a customer in a single shopping trip.
Market_Basket_Optimisation.csv Example transaction rows (simplified):
| Item 1 | Item 2 | Item 3 | Item 4 | ... |
|---|---|---|---|---|
| Bread | Butter | Jam | | |
| Mineral water | Chocolate | Eggs | Milk | |
| Spaghetti | Tomato sauce | Parmesan | | |
Here, empty cells mean no item was purchased in that slot.
This dataset is frequently used in data mining, analytics, and recommendation systems. Common applications include:
Association Rule Mining (Apriori, FP-Growth): for example, {Bread, Butter} ⇒ {Jam} with high support and confidence (see the sketch after this list).
Product Affinity Analysis: identifying which items tend to be purchased together.
Recommendation Engines: suggesting items that are frequently bought together.
Marketing Campaigns: designing product bundles and cross-sell promotions.
Inventory Management: stocking and shelving items that tend to sell together.
The dataset also has some limitations to keep in mind:
No Customer Identifiers: transactions cannot be linked to individual shoppers.
No Timestamps: purchase dates and times are not recorded.
No Quantities or Prices: only the presence of an item in a transaction is captured.
Sparse & Noisy: most items appear in only a small fraction of transactions.
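To make the association-rule-mining application concrete, here is a minimal Python sketch using the mlxtend library (the min_support and min_threshold values are arbitrary examples; reading the CSV with header=None assumes the transaction-per-row layout shown above):
```
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

# Each row of the CSV is one transaction; empty cells mean no item in that slot.
raw = pd.read_csv("Market_Basket_Optimisation.csv", header=None)
transactions = [row.dropna().tolist() for _, row in raw.iterrows()]

# One-hot encode the transactions into a boolean item matrix.
te = TransactionEncoder()
onehot = pd.DataFrame(te.fit(transactions).transform(transactions), columns=te.columns_)

# Mine frequent itemsets, then derive rules such as {Bread, Butter} => {Jam}.
frequent_itemsets = apriori(onehot, min_support=0.01, use_colnames=True)
rules = association_rules(frequent_itemsets, metric="confidence", min_threshold=0.3)
print(rules[["antecedents", "consequents", "support", "confidence"]].head())
```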