Getting clean, reliable data, with a large enough population to be useful for any study, is the hardest part of any project, and it is often the part that causes a project to be abandoned. You need to refine your query to make sure that you are collecting the data that will best answer your questions.
Are you looking at emergency admissions only, or do you want to include scheduled surgeries, routine check-ups, etc. as well?
Are you interested in following one hospital in one city, or are you looking for a more general model to work with any hospital?
Are you looking for one holiday in particular (e.g. Halloween) to see if there are spikes around it, or are you looking for trends throughout the year with the idea that you expect spikes around holidays?
It sounds like you need the following:
Hospital Data:
- Dates of Hospital Admission
- Reason for Admission (scheduled surgery? emergency? routine checkup?)
- Length of stay / Diagnosis? (do you want / need this?)
- Location of hospital
- Rates of readmission within one month?
Weather Data:
- Weather around the dates in the first dataset (maybe up to one or two weeks back, for instances of pneumonia for example?)
Holiday Data:
- Not really sure what you are looking for here: traffic patterns? Instances of car crashes? Economic returns, indicating the presence of many people out shopping?
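Once you have all three, the lists above can be lined up by date. Here is a minimal sketch with pandas, assuming one row per admission and one row per day for weather and holidays. Every column name and every value below is made up for illustration (none of it is real hospital, weather, or holiday data), so adjust it to whatever your sources actually provide.

```python
import pandas as pd

# Hypothetical, made-up rows that only illustrate the shape of each dataset.
admissions = pd.DataFrame({
    "admit_date": pd.to_datetime(
        ["2019-01-01", "2019-01-01", "2019-01-02", "2019-01-03"]
    ),
    "reason": ["emergency", "scheduled", "emergency", "emergency"],
})
weather = pd.DataFrame({
    "date": pd.to_datetime(["2019-01-01", "2019-01-02", "2019-01-03"]),
    "min_temp_c": [-3.0, 1.5, 4.0],
})
holidays = pd.DataFrame({
    "date": pd.to_datetime(["2019-01-01"]),
    "holiday": ["New Year's Day"],
})

# Answer the first scoping question: keep emergency admissions only.
emergency = admissions[admissions["reason"] == "emergency"]

# Collapse individual admissions into one count per day.
daily = (
    emergency.groupby("admit_date")
    .size()
    .rename("admissions")
    .reset_index()
    .rename(columns={"admit_date": "date"})
)

# Left joins keep every day that has admissions, even when the weather
# or holiday table has no matching row for that date.
combined = (
    daily.merge(weather, on="date", how="left")
    .merge(holidays, on="date", how="left")
)
combined["is_holiday"] = combined["holiday"].notna()

print(combined)
```

Filtering on the reason for admission before counting is just one way to scope the question; dropping that step gives you all admissions instead.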
The Weather Channel is working with IBM and Watson, so we should have some access to that data, though you might have to connect with the Watson team to get it.
The Holiday data needs some definition.
The Hospital data is by far the hardest to get: you have HIPAA restrictions and all sorts of privacy / security concerns, so most medical facilities are very reluctant to grant access to their data. If you connect with a particular hospital and define exactly what you want to do, they might be willing to share some of their data, but it is highly unlikely unless you can provide a good reason for working on this project. Is it for a class or is it just because you are curious? I'm sure there are IBM groups working on these types of questions, but they would have been contracted by the hospitals for just this purpose, which is how they would have access to the data.
Net: You need to refine your query to get a better definition of the question you are trying to answer. That will better define the data you need.
Good luck!
------------------------------
Tracey Newton
------------------------------
Original Message:
Sent: 01-28-2019 03:19 AM
From: Konan Jean-Claude Kouassi
Subject: Beginner's Query
I think what you need to do is "feature engineering" on your data, with 'Future Admissions Trends' as the labels you want to predict.
The modeling itself could be done with various ML/AI frameworks (TensorFlow, PyTorch, etc.), but since we are here in the IBM community, you could look through the tools available and choose the one that is easiest for a beginner.
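To give an idea of what feature engineering could look like here, below is a rough sketch in pandas. It assumes you already have a daily table with `date`, `admissions`, and `min_temp_c` columns; those names, the `add_features` helper, and the numbers are all hypothetical placeholders, not real data or a fixed recipe.

```python
import pandas as pd


def add_features(daily: pd.DataFrame) -> pd.DataFrame:
    """Add simple calendar and lag features to a daily admissions table."""
    out = daily.sort_values("date").reset_index(drop=True).copy()

    # Calendar features: admissions often vary by weekday and season.
    out["day_of_week"] = out["date"].dt.dayofweek  # Monday = 0
    out["month"] = out["date"].dt.month

    # Yesterday's admissions, shifted so each row only sees the past.
    out["admissions_lag_1"] = out["admissions"].shift(1)

    # Mean minimum temperature over the previous 7 days (today excluded),
    # since weather effects may show up days after the weather itself.
    out["min_temp_prev_7d_mean"] = (
        out["min_temp_c"].shift(1).rolling(7, min_periods=1).mean()
    )
    return out


# Made-up example data, for illustration only.
dates = pd.date_range("2019-01-01", periods=10, freq="D")
daily = pd.DataFrame({
    "date": dates,
    "admissions": list(range(10, 20)),
    "min_temp_c": [float(i) for i in range(10)],
})

features = add_features(daily)
print(features.head())
```

The first row has no history, so its lag columns are empty (NaN); you would typically drop or fill those rows before training a model in whichever framework you choose.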
------------------------------
Konan Jean-Claude Kouassi
Practice Makes Perfect!
Original Message:
Sent: 01-25-2019 07:26 AM
From: Syed Zain
Subject: Beginner's Query
Respected seniors and fellow members,
I have recently joined this forum and I hope to make the most of this opportunity.
Recently, I started a project but couldn't go any further because of my limited knowledge.
My goal is to use Hospital Admissions Data Cross-Referenced with Weather & Holiday Data over a Significant Period of Time to Predict Future Admissions Trends.
However, I am stuck on the first stage and I am unable to get a proper data-set to use for this project.
Can anyone out there help out a beginner such as myself?
Apart from that, once I obtain the data, how do I go about this project?
Any help will be highly appreciated.
Thank you :)
------------------------------
Syed Zain
------------------------------