Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
This dataset contains a collection of credit card transactions made in India, offering a comprehensive look at spending habits across the nation. From the gender and card type used in each transaction, to which city saw the highest spending and what kind of expenses were made, it paints an overall picture of how money is being spent in India today. With its variety of variables, researchers have an opportunity to uncover deeper trends in customer spending as well as interesting correlations between data points that can serve as valuable business intelligence. Whether you're interested in learning more about customer preferences or simply practising data analysis techniques, this data is sure to provide insight beyond what one could anticipate.
Open Database License (ODbL) v1.0 (https://www.opendatacommons.org/licenses/odbl/1.0/)
License information was derived automatically
This data set was extracted from the field of Plate Heat-Exchanger design. As such, I understand this to be the first of its type.
There are around 60k data points, each consisting of 22 tags.
Thank you to Rarefied Technologies (SE Asia), who commissioned the data collection software & extracted the data points.
There are a number of very interesting phenomena waiting to be discovered in this data set. Combining the raw tags into derived variables helps reveal them. Folks are most welcome to contact me for tips. Most of all - have fun...
The total amount of data created, captured, copied, and consumed globally is forecast to increase rapidly. While it was estimated at ***** zettabytes in 2025, the forecast for 2029 stands at ***** zettabytes; global data generation is thus expected to triple between 2025 and 2029. Data creation has been expanding continuously over the past decade. In 2020, growth was higher than previously expected, driven by increased demand during the coronavirus (COVID-19) pandemic, as more people worked and learned from home and made greater use of home entertainment options.
This dataset provides information about the number of properties, residents, and average property values for Cool Ridge Road cross streets in Moss Point, MS.
I hope that by uploading some of the data previously shared by [SETI][1] onto Kaggle, more people will become aware of SETI's work and become engaged in the application of machine learning to the data (amongst other things). Note, I am in no way affiliated with SETI, I just think this is interesting data and amazing science.
If you’re reading this, then I’m guessing you have an interest in data science. And if you have an interest in data science, you’ve probably got an interest in science in general.
Out of every scientific endeavour undertaken by humanity, from mapping the human genome to landing a man on the moon, it seems to me that the Search for Extra-terrestrial Intelligence (SETI) has the greatest chance to fundamentally change how we think about our place in the Universe.
Just imagine if a signal was detected. Not natural. Not human. On the one hand it would be a Copernican-like demotion of mankind’s central place in the Cosmos, and on the other an awe-inspiring revelation that somewhere out there, at least once, extra-terrestrial intelligence emerged.
Over the past few years, SETI have launched a few initiatives to engage the public and ‘citizen scientists’ to help with their search. Below is a summary of their work to date (from what I can tell).
In January 2016, the Berkeley SETI Research Center at the University of California, Berkeley started a program called Breakthrough Listen, described as "*the most comprehensive search for alien communications to date*". Radio data is currently being collected by the Green Bank Observatory in West Virginia and the Parkes Observatory in New South Wales, with optical data being collected by the Automated Planet Finder in California. Note that (for now at least) the rest of this description focuses on the radio data.
The basic technique for finding a signal is this: point the telescope at a candidate object and listen for 5 minutes. If any sort of signal is detected, point slightly away and listen again. If the signal drops away, then it's probably not terrestrial. Go back to the candidate and listen again. Is the signal still there? Now point to a second, slightly different position. How about now? The most interesting finding is, as you might expect, SIGNAL - NO SIGNAL - SIGNAL - NO SIGNAL - SIGNAL.
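A minimal sketch of that ON/OFF cadence logic, assuming each observation has already been reduced to a power spectrum and using an illustrative threshold-over-median detector (this is not Breakthrough Listen's actual pipeline, just the idea):

```python
import numpy as np

def detected(power_spectrum, threshold=25.0):
    """Crude detection: any channel exceeding `threshold` times the median power."""
    spectrum = np.asarray(power_spectrum, dtype=float)
    return bool(np.any(spectrum > threshold * np.median(spectrum)))

def interesting_cadence(on_spectra, off_spectra, threshold=25.0):
    """True for the ON-OFF-ON-OFF-ON pattern described above:
    the signal is present in every ON pointing and absent in every OFF pointing."""
    on_hits = [detected(s, threshold) for s in on_spectra]
    off_hits = [detected(s, threshold) for s in off_spectra]
    return all(on_hits) and not any(off_hits)

# Toy example: a narrowband spike injected only into the ON pointings
rng = np.random.default_rng(0)
noise = lambda: rng.chisquare(2, size=1024)
on = [noise() for _ in range(3)]
for s in on:
    s[500] += 200.0          # injected "signal" in the ON pointings only
off = [noise() for _ in range(2)]
print(interesting_cadence(on, off))   # expected: True
```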
The Breakthrough Listen project has just about everything covered: the hardware and software to collect signals, the time, the money, and the experts to run the project. The only sticking point is the data. Even after compromising on the raw data's time or frequency resolution, Breakthrough Listen is archiving 500 GB of data every hour (!).
The resulting data are stored in something called a filterbank file, which is created at three different frequency resolutions.
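For readers who want to open one of these files, the Breakthrough Listen team maintains the open-source blimpy package. A minimal sketch follows; the file name is a placeholder and the exact call signatures should be checked against the blimpy documentation:

```python
from blimpy import Waterfall  # pip install blimpy

# Load a small filterbank file (placeholder path)
wf = Waterfall("example.fil")

wf.info()                     # print header metadata (source, frequency range, resolution, ...)
freqs, data = wf.grab_data()  # frequency axis and the time x frequency power array
print(data.shape)
```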
Breakthrough Listen's primary method of engaging the public is something called SETI@Home: a program can be downloaded and installed, and your PC, when idle, downloads packets of data and runs various analysis routines on them.
Beyond this, they have shared a number of starter scripts and some data. To find out more, a general landing page can be found [here][2]. The scripts can be found on GitHub [here][3], and a data archive can be found [here][4]. Note that the optical data from the Automated Planet Finder is also in a different format called a FITS file.
The second initiative by SETI to engage the public was the SETI@IBMCloud project launched in September 2016. This provided the public with access to an enormous amount of data via the IBM Cloud platform. This initiative, too, came with an excellent collection of starter scripts which can still be found on GitHub [here][5]. Unfortunately, at the time of writing, this project is on hold and the data cannot be accessed.
There are a few other sources of data online from SETI, one of which is the basis for this dataset.
In the summer of 2017, SETI hosted a machine learning challenge where simulated datasets of various sizes were provided to participants along with a blinded test set. The winning team achieved a classification accuracy of 94.67% using a convolutional neural network. The aim of this challenge was to attempt a novel approach to signal detection: to go beyond traditional signal analysis and turn the problem into an image classification task by converting the signals into spectrograms.
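The signal-to-spectrogram conversion itself is standard; here is a hedged sketch using SciPy on a synthetic drifting tone (not the challenge's actual preprocessing):

```python
import numpy as np
from scipy.signal import spectrogram

# Synthetic stand-in for a complex-valued voltage time series
fs = 1024.0                                   # sample rate (Hz), illustrative
t = np.arange(0, 10, 1 / fs)
drifting_tone = np.exp(2j * np.pi * (100 + 2 * t) * t)   # slowly drifting narrowband signal
x = drifting_tone + 0.5 * (np.random.randn(t.size) + 1j * np.random.randn(t.size))

# Convert to a time-frequency image that a CNN can classify
f, tt, Sxx = spectrogram(x, fs=fs, nperseg=256, noverlap=128, return_onesided=False)
image = 10 * np.log10(np.abs(Sxx) + 1e-12)    # dB power, shape (freq bins, time bins)
print(image.shape)
```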
The primary traini...
License: https://brightdata.com/license
We will create a customized startups dataset tailored to your specific requirements. Data points may include startup foundation dates, locations, industry sectors, funding rounds, investor profiles, financial health, market positions, technological assets, employee counts, and other relevant metrics.
Utilize our startups datasets for a variety of applications to boost strategic planning and innovation tracking. Analyzing these datasets can help organizations grasp market trends and growth opportunities within the startup ecosystem, allowing for more precise strategy adjustments and operations. You can choose to access the complete dataset or a customized subset based on your business needs.
Popular use cases include: enhancing competitive analysis, identifying emerging market trends, and finding high-potential investment opportunities.
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Quantiles and expectiles of a distribution are found to be useful descriptors of its tail in the same way as the median and mean are related to its central behavior. This article considers a valuable alternative class to expectiles, called extremiles, which parallels the class of quantiles and includes the family of expected minima and expected maxima. The new class is motivated via several angles, which reveals its specific merits and strengths. Extremiles suggest better capability of fitting both location and spread in data points and provide an appropriate theory that better displays the interesting features of long-tailed distributions. We discuss their estimation in the range of the data and beyond the sample maximum. A number of motivating examples are given to illustrate the utility of estimated extremiles in modeling noncentral behavior. There is in particular an interesting connection with coherent measures of risk protection. Supplementary materials for this article are available online.
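As a rough illustration of the idea, a sample extremile can be computed as a weighted average of the order statistics. The sketch below uses the weight function reported in the extremile literature (K_tau(t) = t^r with r = log(1/2)/log(tau) for tau >= 1/2, and the mirrored form below 1/2); treat that choice as an assumption to check against the article rather than its exact estimator:

```python
import numpy as np

def extremile(sample, tau):
    """Sample extremile of order tau as a weighted mean of order statistics,
    using a weight function assumed from the extremile literature."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    grid = np.arange(n + 1) / n
    if tau >= 0.5:
        r = np.log(0.5) / np.log(tau)
        K = grid ** r                    # K_tau(t) = t^r
    else:
        s = np.log(0.5) / np.log(1.0 - tau)
        K = 1.0 - (1.0 - grid) ** s      # K_tau(t) = 1 - (1 - t)^s
    weights = np.diff(K)                 # probability mass placed on each order statistic
    return float(np.dot(weights, x))

rng = np.random.default_rng(1)
data = rng.pareto(3.0, size=5000)        # a long-tailed sample
print(extremile(data, 0.5))              # tau = 0.5 recovers the sample mean
print(extremile(data, 0.99))             # a tail extremile
```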
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
We present a new CUSUM procedure for sequential change-point detection in self- and mutually-exciting point processes (specifically, Hawkes networks) using discrete events data. Hawkes networks have become a popular model in statistics and machine learning, primarily due to their capability in modeling irregularly observed data where the timing between events carries a lot of information. The problem of detecting abrupt changes in Hawkes networks arises from various applications, including neuroengineering, sensor networks, and social network monitoring. Despite this, there has not been an efficient online algorithm for detecting such changes from sequential data. To this end, we propose an online recursive implementation of the CUSUM statistic for Hawkes processes, which is computationally and memory-efficient and can be decentralized for distributed computing. We first prove theoretical properties of this new CUSUM procedure, then show the improved performance of this approach over existing methods, including the Shewhart procedure based on count data, the generalized likelihood ratio statistic, and the standard score statistic. This is demonstrated via simulation studies and an application to population code change-detection in neuroengineering.
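For intuition, the core CUSUM recursion in its generic log-likelihood-ratio form looks like the sketch below; the per-event increments are placeholders for the pre-/post-change intensity terms, and this is not the paper's Hawkes-specific statistic:

```python
def cusum_monitor(increments, threshold):
    """Generic CUSUM recursion: S_t = max(0, S_{t-1} + l_t), alarm when S_t > b.
    `increments` yields per-event log-likelihood-ratio terms l_t; for Hawkes
    networks these would come from the pre- and post-change intensities."""
    s = 0.0
    for t, l_t in enumerate(increments):
        s = max(0.0, s + l_t)
        if s > threshold:
            return t          # declare a change at event index t
    return None               # no alarm raised

# Toy usage: negative drift before the change, positive drift after it
import random
random.seed(0)
pre = [random.gauss(-0.2, 1.0) for _ in range(200)]
post = [random.gauss(0.5, 1.0) for _ in range(200)]
print(cusum_monitor(pre + post, threshold=20.0))
```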
Dating from 1948, the photos provide an accurate historical record of the land and form an important part of Western Australia's spatial information. Here are some interesting examples captured and archived by Landgate. This dataset contains vectorised centroid points for the theme park. To view the aerial photography for the sites, please refer to https://catalogue.data.wa.gov.au/group/interesting-historical-photography © Western Australian Land Information Authority (Landgate). Use of Landgate data is subject to Personal Use License terms and conditions unless otherwise authorised under approved License terms and conditions.
Table S4: Body temperature data
This .csv file contains data used in the analyses of body temperature and thermoregulation in the manuscript "It's cool to be dominant". Key to the column headings as follows:
- date: date of collection of data point.
- time.round: time of collection of data point, rounded to the nearest minute.
- time.deci: time.round converted to a decimal.
- species: bird species (common names).
- bird.id: unique colour-ring code of the bird.
- Tb: body temperature of the bird.
- soc.rank: social rank of the bird (1 = most dominant).
- Ta: air temperature as recorded by the weather station.
- tod: time of day: day = 6:00am – 7:59pm; night = 8:00pm – 5:59am.
- mass: capture body mass of the bird.
- sex: sex of the bird, m = male, f = female, u = undetermined.
Table S4.csv
Table S5: Behavioural data
This .csv file contains data used in the analyses of behavioural thermoregulatory strategies shade-seeking, activity, and panting in the manuscript "It's cool to be dominant". Key to the column head...
License: https://spdx.org/licenses/CC0-1.0.html
Predicting which ecological factors constrain species distributions is a fundamental ecological question and critical to forecasting geographic responses to global change. Darwin hypothesized that abiotic factors generally impose species’ high-latitude and high-elevation (typically cool) range limits, whereas biotic interactions more often impose species’ low-latitude/low-elevation (typically warm) limits, but empirical support has been mixed. Here, we clarify three predictions arising from Darwin’s hypothesis, and show that previously mixed support is partially due to researchers testing different predictions. Using a comprehensive literature review (885 range limits), we find that biotic interactions, including competition, predation, and parasitism, contributed to >60% of range limits, and influenced species’ warm limits more often than cool limits. Abiotic factors contributed more often than biotic interactions to cool range limits, but temperature contributed frequently to both cool and warm limits. Our results suggest that most range limits will be sensitive to climate warming, but warm-limit responses will depend strongly on biotic interactions.
Methods
There are two sets of data: data on potentially range-limiting factors (one file for cool limits and one file for warm limits), and data for locations of each data point (again, one file for cool limits, one for warm limits).
See Supplementary Materials for full description. In brief:
We searched Web of Science for studies published up to the end of 2019 that assessed the causes of species’ high latitude/elevation (hereafter ‘cool’) or low latitude/elevation (hereafter ‘warm’) range limits. To increase coverage in areas with few studies, we repeated the search in Spanish and French and did a targeted search for studies from Africa. We screened results for studies that assessed the importance of at least one biotic or abiotic factor in causing a cool or warm range limit.
We extracted data for each potentially range-limiting factor assessed, separating data by study and species whenever possible. For each study x taxon x range limit (latitude or elevation, cool or warm, separated by continent or ocean if applicable), we identified the potential range-limiting factors assessed, such that each factor a study assessed at a given range limit contributed 1 data point. We noted whether each factor was biotic or abiotic (‘factor type’) and what category of factor it was. Multiple assessments of a factor category (e.g. max. and mean annual temperature) at one species’ range limit would contribute one data point. Factors outside these categories were assigned as ‘other’.
We collected various meta-data about each data point, as explained in the 'column headings explained' sheet.
We assessed whether each factor contributed to the given range limit (‘yes’ or ‘no’), determined from statistical results, figures, and author arguments when necessary. Data and reasoning behind these decisions are given in separate columns. If a study considered >1 measure of one factor (e.g. summer and winter temperature), we deemed the factor (temperature in this example) supported if any measure contributed to the range limit. For studies in Cahill et al. (2014), we used their conclusions unless a) data were grouped across species or studies, in which case we ungrouped data and reassessed conclusions for each species/study, or b) we spot checked the study and could not find evidence to support the conclusion. These decisions are also detailed in the data.
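A hedged sketch of how that collapsing rule might look in pandas, using hypothetical column names (`study`, `species`, `limit`, `factor_category`, `contributed`) that stand in for the actual headings described in the 'column headings explained' sheet:

```python
import pandas as pd

# Hypothetical extraction table: one row per measure assessed
rows = pd.DataFrame({
    "study": ["A", "A", "A", "B"],
    "species": ["sp1", "sp1", "sp1", "sp2"],
    "limit": ["warm", "warm", "warm", "cool"],
    "factor_category": ["temperature", "temperature", "competition", "temperature"],
    "contributed": [False, True, False, True],   # did this measure contribute to the limit?
})

# One data point per study x species x limit x factor category;
# the factor counts as contributing if ANY of its measures did.
data_points = (
    rows.groupby(["study", "species", "limit", "factor_category"], as_index=False)
        ["contributed"].any()
)
print(data_points)
```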
Measuring the Internet is indispensable to a better understanding of its current state and trends, but obtaining measurements is difficult both qualitatively and quantitatively. The challenges of "Internet Measurement" are manifold due to the nature of the beast, namely its complexity, distribution, and constant change. The Internet grows continuously and since it consists of many interdependent autonomous systems, there is no ground truth regarding how it looks – not even retrospectively. Nonetheless, we rely on a fundamental understanding of its state and dynamics to approach a solution. Since it is impractical to understand such complex systems at once – research on complex systems is older than the Internet itself – in this study we focus on a better understanding of the key players on the Internet by measuring Internet service providers (ISPs), Internet exchange points (IXPs), and systems running the Internet, such as routing, packet exchange, and the Domain Name System (DNS).

We describe our methodology of passively measuring large amounts of network traffic data at different vantage points, and discuss the challenges, solutions, and best practices that we experienced in the course of our work. Our measurements require an understanding of the vantage points in terms of their characteristics, the systems we measure, and the data we obtain. In the course of the work, we do not exclusively rely on passive data collection and its analysis. Instead, combining our active and passive measurements helps us to improve the understanding of the data in the domain of Internet network operation.

Our primary findings regard the role of IXPs in the current Internet ecosystem. We find that IXPs are understudied compared to their importance as hubs for exchanging Internet traffic, some of them handling traffic volumes comparable to major ISPs. We identify and describe different models of IXPs' operation specific to marketplaces, namely Europe and North America. We make use of different kinds of publicly available data and proprietary data collection of Internet traffic to which we have been granted access. Our measurement results show that the Internet peering complexity is higher than anticipated in previous publications, and that IXPs are the key to this unexpected complexity. This highlights the importance of IXPs and the role they play in today's Internet ecosystem.

To further improve our understanding of global players' operation in the Internet, we use a DNS protocol extension (EDNS0) to reveal the mapping of users to servers for one of the early adopters of this extension. The elegance of this particular measurement is in its ability to run a global crawl from a single vantage point without the need to access proprietary data or a significant amount of infrastructure.

We find it useful to examine both dominant and emerging Internet components to gain a better understanding of how the Internet changes and how it is used. It is critical to measure the Internet's driving forces, but this is a difficult task and comes with technical and legal restrictions. In order to make the best use of the data we have, it is possible and practical to combine measurement methods. As the Internet evolves constantly and rapidly, the quest to understand becomes more challenging by the hour. However, even without access to private data it is possible to find exciting details regarding how this large system is operated.
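As an aside for readers curious about the EDNS0 technique mentioned above, here is a minimal sketch using the dnspython library and its EDNS Client Subnet option. The hostname, resolver address, and client prefix are placeholders, and this is a generic illustration rather than the thesis's measurement code:

```python
import dns.edns
import dns.message
import dns.query

# Ask "on behalf of" clients in a given /24 using EDNS Client Subnet, and observe
# which server addresses come back for that client location.
ecs = dns.edns.ECSOption("203.0.113.0", 24)             # placeholder client prefix (TEST-NET-3)
query = dns.message.make_query("www.example.com", "A", use_edns=0, options=[ecs])
response = dns.query.udp(query, "8.8.8.8", timeout=3)   # placeholder resolver

for rrset in response.answer:
    print(rrset)
```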
The data are images of tissue cultures of Valencia sweet orange nonembryogenic callus cells taken in 2006 at the U.S. Horticultural Research Laboratory, Ft. Pierce, Florida, USA to photo document treatments from an experiment designed to determine the effects of mineral nutrition on callus growth using a 5-factor response surface methodology (RSM) design. The design matrix is presented in the spreadsheet file 5 factor RSM Design_11-28-2005.xlsx. The design included 46 runs divided into 3 blocks. Six culture dishes were used to estimate the response for each run. A culture dish representative of the run was photographed. Each image is named to match the run of the design. For example, the jpg image labeled Run 04_Blk1_5F RSM_NE_Valencia_11-28-2005.JPG is an image of the callus grown on Run #4 that was part of Block 1 of the 5-factor RSM design for the nonembryogenic Valencia sweet orange callus cells. This dataset includes 47 files – 46 image files and 1 Excel spreadsheet. Images were captured in JPEG (EXIF 2.2) format with a Nikon Coolpix 5400 digital camera equipped with a 1/1.8" (7.2 x 5.3 mm) CCD sensor at a resolution of 2592 x 1944 pixels. Each image is of a single culture plate with the top lid removed and photographed under cool white, fluorescent lighting. The experimental setup for the 5-factor response surface design is described in - Niedz, R. P. and T. J. Evens (2007). "Regulating plant tissue growth by mineral nutrition." In Vitro Cellular & Developmental Biology - Plant 43(4): 370-381.
Description of the points of interest along the Straße der Romanik (Romanesque Road) in Saxony-Anhalt.
License: https://creativecommons.org/publicdomain/zero/1.0/
By [source]
This comprehensive dataset offers an in-depth exploration of US travel check-ins scraped from Instagram. It includes detailed data such as the location of each check-in, the USIndex for each state, the average temperature for each state per month, and the crime rate per state. In addition to location and time information, this dataset also provides latitude and longitude coordinates for every entry. This extensive collection of data is invaluable for those interested in studying various aspects of movement within the United States. With detailed insight into factors like climate conditions and the economic health of a region at a given point in time, this dataset can help uncover fascinating trends regarding how travelers choose their destinations and how they experience their journeys around the country.
This Kaggle dataset - US Travel Check-Ins Analysis - provides valuable insights for travel researchers, marketers, and businesses in the travel industry. It contains the check-in location, USIndex rating (economic health of each state), average temperature, and crime rate per state. The latitude and longitude of each check-in are also provided, adding geographic context to help you visualize the data.
This guide will show you how to use this dataset for your research or business venture.
Step 1: Prepare your data. First and foremost, cleanse the data before analyzing it. Depending on the kind of analysis to be conducted (e.g., time series analysis), select the columns that best match your needs and exclude any that are not relevant, such as date- or season-related fields. Variable formatting should also be consistent across all instances in a column. You can double-check this by running a quick summary on selected columns - for example, df['var'].describe() in Python returns a column's statistical makeup, including mean values and quartile ranges (see the sketch below).
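A small sketch of this preparation step, assuming pandas and placeholder column names (`state`, `usindex`, `avg_temp`, `crime_rate`, `lat`, `lon`), since the actual headers may differ:

```python
import pandas as pd

df = pd.read_csv("us_travel_checkins.csv")            # placeholder file name

# Keep only the columns relevant to the planned analysis
cols = ["state", "usindex", "avg_temp", "crime_rate", "lat", "lon"]
df = df[cols].dropna()

# Ensure consistent formatting/types across each column
df["state"] = df["state"].str.strip().str.upper()
num_cols = ["usindex", "avg_temp", "crime_rate", "lat", "lon"]
df[num_cols] = df[num_cols].apply(pd.to_numeric, errors="coerce")

# Quick statistical summary of a single column
print(df["avg_temp"].describe())
```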
Step 2: Explore and analyze your data graphically. Once the data has been prepared, visualize it to gain better insight into any trends or patterns, ideally alongside other information sources such as weather forecasts or nationwide trend indicators. Grafana dashboards work well when multiple datasets need to be compared, while Excel offers great customization and a range of export formats (.csv, .jpeg, .pdf), depending on the charts being used. Plotting markers onto map applications such as the Google Maps API adds geographical awareness, which is useful when analyzing location-dependent variables - leveraging existing software alongside publicly available APIs gives you an advantage over manual inspection. A plain latitude/longitude plot, as sketched after this step, is often enough to start with.
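A matching visual-exploration sketch: a simple latitude/longitude scatter coloured by crime rate, using the same placeholder file and column names as in the Step 1 sketch:

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("us_travel_checkins.csv")            # same placeholder file as above

fig, ax = plt.subplots(figsize=(8, 5))
sc = ax.scatter(df["lon"], df["lat"], c=df["crime_rate"], s=5, cmap="viridis", alpha=0.6)
fig.colorbar(sc, ax=ax, label="crime rate")
ax.set_xlabel("longitude")
ax.set_ylabel("latitude")
ax.set_title("Check-in locations coloured by state crime rate")
plt.show()
```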
Step 3: Interpretation & Hypothesis Testing
After drawing informative interpretations from the exploratory visualizations, the next step is to test hypotheses based on the correlations observed between variables - for example, whether check-in activity concentrates in regions whose climate, economic health, or crime rate would be expected to yield higher traveler satisfaction.
- Travel trends analysis: Using this dataset, researchers could track which areas of the US are popular destinations based on travel check-ins and spot any interesting trends or correlations in terms of geography, seasonal changes, economic health or crime rates.
- Predictive Modeling: By using various features from this dataset such as average temperature, US Index and crime rate, predictors could be developed to suggest how safe an area would feel to a tourist based on their current location and other predetermined variables they choose to input into the model.
- Trip Planning Tool: The dataset can also be used to develop a tool that quickly allows travelers to plan trips according to their preferences in terms of duration and budget as well a...
License: https://datos.madrid.es/egob/catalogo/aviso-legal
In this dataset you can find the addresses, collection times, and geolocated points of each of the used-vegetable-oil containers available in the districts. In total there are more than 550 locations in the city where citizens can deposit used vegetable oil: containers installed on public roads, in markets and galleries, and at municipal offices. Used vegetable oil can also be deposited at:
— Fixed clean points (indicated in the file below and in this link)
— Proximity clean points (you can access them in this link)
— Mobile clean points (you can access them at this link)
In the city of Madrid, apart from these points for used vegetable oil, the following also exist to encourage recycling: fixed clean points, mobile clean points, proximity clean points, authorised clothing containers of Madrid City Council, battery containers in bus shelters, and containers for paper-cardboard, glass, packaging, organic waste and other refuse. Also of interest is this other related dataset: Types of waste and where to deposit them. Very important: before approaching the indicated points, please check the possible schedules and days.
If you want this dataset, kindly fill in the "Request access" form towards the bottom of this page and also email soumyadeep.roy9@gmail.com.
Kindly cite the paper: https://dl.acm.org/citation.cfm?id=3326048
BibTeX:
@inproceedings{Roy:2019:UBC:3292522.3326048,
  author    = {Roy, Soumyadeep and Ganguly, Niloy and Sural, Shamik and Chhaya, Niyati and Natarajan, Anandhavelu},
  title     = {Understanding Brand Consistency from Web Content},
  booktitle = {Proceedings of the 10th ACM Conference on Web Science},
  series    = {WebSci '19},
  year      = {2019},
  isbn      = {978-1-4503-6202-3},
  location  = {Boston, Massachusetts, USA},
  pages     = {245--253},
  numpages  = {9},
  url       = {http://doi.acm.org/10.1145/3292522.3326048},
  doi       = {10.1145/3292522.3326048},
  acmid     = {3326048},
  publisher = {ACM},
  address   = {New York, NY, USA},
  keywords  = {affective computing, brand personality, reputation management, text classification},
}
Abstract:
Brands produce content to engage with their audience continually and tend to maintain a set of human characteristics in their marketing campaigns. In this era of digital marketing, they need to create a lot of content to keep up engagement with their audiences. However, authoring content at this scale introduces challenges in maintaining consistency in a brand's messaging tone, which is very important from a brand's perspective to ensure a persistent impression for its customers and audiences. In this work, we quantify brand personality and formulate its linguistic features. We score text articles extracted from brand communications on five personality dimensions: sincerity, excitement, competence, ruggedness and sophistication, and show that a linear SVM model achieves a decent F1 score of 0.822. The linear SVM allows us to annotate a large set of data points free of any annotation error. We utilize this large annotated dataset to characterize the notion of brand consistency, that is, maintaining a company's targeted brand personality across time and over different content categories, and we make certain interesting observations. To our knowledge, this is the first study that investigates brand personality from companies' official websites and that formulates and analyzes the notion of brand consistency on such a large scale.
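A hedged sketch of the kind of text-classification pipeline the abstract describes (TF-IDF features into a linear SVM); the two-class toy labels below merely stand in for the paper's five personality dimensions and its actual feature engineering:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

# Toy corpus standing in for brand web content labelled on one personality dimension
texts = [
    "innovative bold adventure push the limits",      # excitement
    "thrilling daring spirited new experiences",      # excitement
    "reliable trusted dependable quality service",    # competence
    "proven expertise professional secure results",   # competence
] * 25
labels = ["excitement", "excitement", "competence", "competence"] * 25

X_train, X_test, y_train, y_test = train_test_split(texts, labels, test_size=0.3, random_state=0)

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
model.fit(X_train, y_train)
print(f1_score(y_test, model.predict(X_test), average="macro"))
```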
Dataset description: Each file contains the scraped textual content from the official webpages of Fortune 1000 companies. We use the ranks from the 2017 Fortune 1000 list. Please read the paper for details about data collection and cleaning.
Directory structure (compressed size: 3.7 GB, uncompressed size: 28.9 GB):
├── Cleaned MTlarge data
│   ├── final_dynamic_data.csv (1.0 GB) : Dynamic pages per company
│   └── final_static_data.csv (3.8 MB) : Static pages for each company
└── Raw Scrapped Data (27.8 GB)
    ├── first50fortune.csv : raw scraped files for Fortune 1000 companies between rank 1 and 50
    ├── fortune150_300.csv : between rank 150 and 300
    ├── fortune300_500.csv : between rank 300 and 500
    ├── fortune500_550.csv : between rank 500 and 550
    ├── fortune50_150.csv : between rank 50 and 150
    ├── fortune550_800.csv : between rank 550 and 800
    └── fortune800_1000.csv : between rank 800 and 1000
This dataset represents amenities activated as a part of Cool It! NYC, a Citywide plan to increase the amount of cooling features available to the public during heat emergencies, particularly in neighborhoods that face the dangers of high heat. This is part of the Cool It! NYC 2020 Data Collection, which includes the following amenities:
Drinking Fountains: Indicates whether a drinking fountain is activated, not yet activated, broken, or under construction.
Spray Showers: Indicates whether a spray shower installed before July 2020 is activated, not yet activated, broken, or under construction. At this time, spray showers are mapped to the middle of parks.
Cooling Sites: To measure neighborhoods that are the most at risk during extreme heat, NYC Health and Columbia University developed the New York City Heat Vulnerability Index, or HVI. Parks used this data to direct new cooling elements to neighborhoods with HVIs of 4 and 5.
Data Dictionary: https://docs.google.com/spreadsheets/d/1GpXHX9p0e520LcAf3gstOKTQm64wxkdDUiACjhMwd9Q/edit?usp=sharing
Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
License information was derived automatically
This report of a survey carried out under contract to the NCC describes the shores of Lewis from Mealista on the west coast, north to the Butt of Lewis and south down the east coast as far as the Eye Peninsula, east of Stornoway. Full descriptions of the 33 stations visited are given and excerpts from Powell et al. (1979) included where these also describe the same sites. The area covered in the report is very large, some 150 km of coast excluding the Loch Roag complex. Apart from Loch Roag, the shores are mostly open, exposed to extremely exposed, and except in their detail show little variety. Loch Roag, by contrast, is largely sheltered and has a great variety of habitats, rocky shores, various sediments, rapids, lagoons and brackish areas. The rocks are predominantly Lewisian Gneiss, which restricts crevice and soft rock fauna. All littoral species which could be identified were recorded, but the list is weighted towards Mollusca. Attention is drawn towards species of particular significance for being at or near the limit of their range. It is noted that autumn populations of molluscs were more diverse and contained higher numbers of animals than spring populations. The very exposed and most sheltered shores had the lowest diversity and, normally, lowest numbers of animals, and normally the sheltered sites were the most prolific. The author considers that Lewis is shown by the survey to be as rich and interesting as the more southern isles, and following upon Powell (1979), indicates that some areas are worthy of the highest conservation interest. An appraisal of the biological interest and importance of the shores of the Lewis area is made by the author, the coast being divided into regions. Of these, Loch Roag was found to be extremely rich and interesting in the intertidal, more so than other parts of Lewis with the possible exception of Loch Erisort (Smith, 1982). Judged as one of the most interesting and important areas in western Scotland with its great variety of habitat and high diversity of molluscs, it was assessed as Grade 1+. The coast south-west of Loch Roag is one of the most exposed in the Outer Hebrides, isolated, with an irregular coastline and interesting hanging saltmarsh at the tops of the cliffs. This region is assessed as Grade 1-2 for exposure and wilderness. The north-west coast of Lewis is a uniform area of coastline which was assessed with reservations as Grade 2. The east coast of Lewis comprised several contrasting zones, much of it not of any especial importance. Port Skigersta is assessed as Grade 3 and Melbost Point as Grade 1. Records currently considered sensitive have been removed from this dataset.
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
In multiple time series data, clustering the component profiles can identify meaningful latent groups while also detecting interesting change points in their trajectories. Conventional time series clustering methods, however, suffer the drawback of requiring the co-clustered units to have the same cluster membership throughout the entire time domain. In contrast to these “global” clustering methods, we develop a Bayesian “local” clustering method that allows the functions to flexibly change their cluster memberships over time. We design a Markov chain Monte Carlo algorithm to implement our method. We illustrate the method in several real-world datasets, where time-varying cluster memberships provide meaningful inferences about the underlying processes. These include a public health dataset to showcase the more detailed inference our method can provide over global clustering alternatives, and a temperature dataset to demonstrate our method’s utility as a flexible change point detection method. Supplemental materials for this article, including R codes implementing the method, are available online.