[Population Modeling] Webinar: Machine Learning Using Healthcare.ai: A Hands-on Learning Session

Jacob Barhak jacob.barhak at gmail.com
Wed Feb 1 02:24:40 PST 2017


Greetings Modelers,

Those of you interested in machine learning in healthcare may want to tune
in to the webinar next week.

I am forwarding a message I got from Health Catalyst after editing the
message for compatibility.

Hopefully this will interest some of you.

              Jacob


---------- Forwarded message ----------
From: Jonathan Weege <jonathan.weege at healthcatalyst.com>
Date: Tue, Jan 31, 2017 at 11:02 AM
Subject: Pass this along
To: jacob.barhak at gmail.com


This week, we are featuring a unique webinar led by our data scientists who
will explain more about the growing field of machine learning in
healthcare. They will show how to develop a machine learning model using
our free healthcare.ai website. This website contains tools and tutorials
to help those in your organizations who are interested in building machine
learning models and collaborating with others. It is supported by experts
at Health Catalyst and does not require you to be a customer or to be on
our Health Catalyst platforms. Our interest is in promoting the use of
machine learning across healthcare.

If you are not the right person to attend this webinar, please pass this
email on to the analysts within your organization who might be interested
in attending.

Also, included below are several helpful blogs for those that wish to read
and do more on their own with machine learning.
<http://go.healthcatalyst.com/n0Cg0zM030Pw0ER0Q700n26>

*Featured Webinar*

*Machine Learning Using Healthcare.ai: A Hands-on Learning Session
<http://go.healthcatalyst.com/DCA600N0ERQ00207wP3g0n0>*

*Levi Thatcher, PhD, Director of Data Science, Health Catalyst*
*Michael Mastanduno, PhD, Data Scientist, Health Catalyst*

*Wednesday, Feb 8 1-2PM EST*

The purpose of this webinar is to take a tour of Healthcare.ai, a free,
predictive analytics platform that is tailored to healthcare, and to get a
hands-on demo of how to develop a machine learning model. The package
allows users to pull data, implement machine learning algorithms to make
predictions, and improve patient outcomes in real clinical settings. While
there is other free software to develop predictive models, many
organizations fail to provide a sufficient infrastructure to achieve value
from it. Healthcare.ai provides tools to help walk a user through the
process of data cleansing, model selection, performance evaluation, and
interpretation. Finally, the package supports SQL-in, SQL-out data flow to
be truly embedded into the analytics environment and make deployment
reliable and consistent. The webinar will go through the capabilities of
healthcare.ai and then show the installation process and real models being
built and deployed. We’ll dive into the R software
environment and show you how easy it is to use machine learning in your
analytics platform.
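
To make the "SQL-in, SQL-out" idea concrete, here is a minimal sketch in Python. It is not healthcare.ai code: it uses an in-memory SQLite database as a stand-in for SQL Server, and the table names, column names, and the score() function are hypothetical placeholders for a trained model.

```python
import sqlite3

def score(age, prior_admissions):
    """Placeholder risk score in [0, 1]; a real workflow would call a trained model."""
    return min(1.0, 0.005 * age + 0.15 * prior_admissions)

# In-memory SQLite stands in for the analytics database (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patients (patient_id INTEGER PRIMARY KEY, age INTEGER, prior_admissions INTEGER)"
)
conn.execute("CREATE TABLE predictions (patient_id INTEGER PRIMARY KEY, risk REAL)")
conn.executemany(
    "INSERT INTO patients VALUES (?, ?, ?)",
    [(1, 40, 0), (2, 72, 3), (3, 65, 1)],  # made-up rows
)

# SQL in: pull the rows to score.
rows = conn.execute("SELECT patient_id, age, prior_admissions FROM patients").fetchall()

# SQL out: write one prediction per patient back to the database.
conn.executemany(
    "INSERT INTO predictions (patient_id, risk) VALUES (?, ?)",
    [(pid, score(age, prior)) for pid, age, prior in rows],
)
conn.commit()

# Downstream reports can now query the predictions table directly.
highest = conn.execute(
    "SELECT patient_id, risk FROM predictions ORDER BY risk DESC LIMIT 1"
).fetchone()
print(highest)
```

Because the predictions land in an ordinary table, existing reports and dashboards can read them without knowing how the model was built.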

Levi Thatcher, Health Catalyst Director of Data Science, and his team will
provide a live demonstration of using healthcare.ai to implement a
healthcare-specific machine learning model from data source to patient
impact. Levi will go through a hands-on coding example while sharing his
insights on the value of predictive analytics, the best path towards
implementation, and avoiding common pitfalls. Frequently asked questions
since the announcement will be shared and answered during the session.

During the webinar, we will:

   1. Describe and install healthcare.ai
   2. Build and evaluate a machine learning model
   3. Deploy interpretable predictions to SQL Server
   4. Discuss the process of deploying into a live analytics environment.

If you’d like to follow along, you should download and install *R
<http://go.healthcatalyst.com/p0600C07OE000BPRnw0gQ23>* and *RStudio
<http://go.healthcatalyst.com/DCC600P0ERQ00207wP3g0n0>* prior to the event.
We look forward to you joining us!

Click here to register
<http://go.healthcatalyst.com/DCA600N0ERQ00207wP3g0n0>
------------------------------

*Learn More On Your Own*

For those interested in learning more on their own, before or after the
webinar, here is an extensive list of 15 article summaries that we have
published for anyone interested in learning more about machine learning in
healthcare. Clicking on any title will go to the full page article on the
site. Click on any or all of these articles to learn more about applying
machine learning in your environment.

Please pass this along to the data analysts or data architects in your
organization who might be interested in sharpening their skills in machine
learning and predictive analytics.

*#1: Welcome to healthcare.ai
<http://go.healthcatalyst.com/Q02E0QR0CD3w00P7n00Q6g0>*
Machine Learning (ML) has existed in other industries for years. Health
Catalyst® is bringing ML’s ability to detect complex patterns to healthcare
with its new open source predictive analytics software called healthcare.ai.
Healthcare.ai democratizes ML by making it available to anyone interested
in healthcare data with the right skills (e.g., BI developers). It
eliminates the need for health systems to hire expensive data scientists to
do this work. Healthcare.ai offers healthcare-specific ML packages (R and
Python), analysis, commentary, and advice on how to leverage ML in any
health system (regardless of size) to improve operational, financial, and
clinical efficiencies. Healthcare.ai’s primary goal is improving patient
outcomes. Healthcare organizations’ ability to leverage ML with
healthcare.ai is critically important given the enormous amount of data in
EMRs and the thousands of patient lives at risk in hospitals every day.

*#2: Why R and Python?
<http://go.healthcatalyst.com/n0Cg0ER030Pw0ER0Q700n26>*
Health Catalyst® chose the statistical tools R and Python for healthcare.ai
for two key reasons:

   1. They are open source – free tools democratize machine learning (ML)
   and increase healthcare data literacy.
   2. They offer the necessary breadth and depth of capabilities – Python
   supports ML, web development, web scraping, and desktop applications. R
   offers well-documented algorithms and tools to do anything in statistics,
   has excellent visualization software, and excels at professional document
   generation (e.g., reports).

Although there are many other statistical tools, such as SAS, Stata, SPSS,
etc., and languages, such as Java, C++, and C#, they’re compiled, making
them difficult to use for data analysis. Healthcare.ai leverages R and
Python because they are the best open-source tools with the right
capabilities to improve patient outcomes.

*#3: The Benefits of Machine Learning in Healthcare
<http://go.healthcatalyst.com/MP0E000QC267n00Sw0gR0F3>*

Healthcare organizations around the country are starting to hear more about
machine learning (ML) and wonder how it can help their clinical teams
improve patient outcomes. There are two key ways ML can help:

   1. ML learns the important relationships in systems’ healthcare data on
   past patients and their outcomes, creating a customized model based on a
   health system’s data.
   2. ML allows health systems to create models based on whatever data
   is available when they need a risk score (e.g., upon admission rather
   than discharge).

ML benefits healthcare because it delivers accurate, timely risk scores,
enables effective resource allocation, and, ultimately, lowers costs and
improves patient outcomes. For example, Health Catalyst®’s open source,
predictive analytics software—healthcare.ai—can demonstrate why a risk
score was high so clinicians know not only which patients are most at risk,
but also what can be done to lower those patients’ risk.

*#4: The Technical Need for healthcare.ai
<http://go.healthcatalyst.com/y72CQ0gP0n0wERG06T00030>*
When it comes to machine learning (ML) in healthcare, health system leaders
are asking two key questions:

   1. How will healthcare.ai, Health Catalyst’s new open source, predictive
   analytics software, enable my team of analysts or data scientists?
   2. How will healthcare.ai finally bring accurate, informative models to
   my health system?

The answers to these questions are rooted in the fact that healthcare.ai
provides systems with the gentle introduction to ML, R, and Python they so
desperately need to improve patient outcomes. Health Catalyst® wants models
health systems put into production to not only help patients now, but for
years to come. For example, healthcare.ai’s production-grade predictive
code ensures smooth transitions as people change jobs. Healthcare.ai does
five key things for health systems:

   1. Offers pre-processing and algorithms appropriate for healthcare
   questions.
   2. Provides appropriate metrics to assess which algorithm generates
   the best model.
   3. Determines which features are most important for each model.
   4. Provides easy connectivity to databases.
   5. Allows systems to easily save and deploy models in production.

*#5: What Models has Health Catalyst Created with healthcare.ai?
<http://go.healthcatalyst.com/q7EC0n0QwU0P6203R0gH000>*
Health Catalyst® practices what it preaches with healthcare.ai, its simple,
flexible tool designed to streamline healthcare machine learning (ML). In
several recent predictive projects, healthcare.ai helped health systems
reduce readmissions.

Although general readmissions models are possible with healthcare.ai, these
projects focused on specific disease cohorts (e.g., heart failure) to
create more accurate models. The approach works by collecting relevant data
on past patients (including whether they were readmitted) and training
models on that data. As a result, clinicians received daily guidance on those
patients most likely to be readmitted.

Given the tremendous resource constraints healthcare organizations are up
against daily, this type of machine learning guidance can be crucial to
efficiently deploying resources toward achieving business goals (e.g.,
reducing readmissions, reducing one-year mortality, preventing HAIs, etc.).

*#6: Model Evaluation Using ROC Curves
<http://go.healthcatalyst.com/z00Q3nw0006RPI27CgV0E00>*
Machine learning (ML) models, like any new technique, drug, or device in
healthcare, should be subject to rigorous testing and review before they
can be trusted by physicians and frontline staff. Since ML has the awesome
responsibility of guiding clinical decision making, trust and transparency
in the models must be absolute. The way to establish this trust is by
evaluating their overall performance, comparing them to other techniques,
and determining when they are ready for production.

A common way to do this is by charting the Receiver Operating
Characteristic Curve (ROC), a graphical representation that shows the
balance of True Positive Rate and False Positive Rate in classifying a data
set. More specifically, a score called the Area Under Curve (AUC) is a
precise way to compare predictive models across patient cohorts to
determine their trustworthiness prior to deployment.
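
As a rough illustration of how an ROC curve and its AUC are computed, here is a minimal sketch in plain Python. The labels and scores are made up, and the function names are hypothetical; this is not the healthcare.ai implementation.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points obtained by sweeping the threshold over scores."""
    pos = sum(labels)
    neg = len(labels) - pos
    # Sort by descending score so each step lowers the threshold.
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Process tied scores together so a tie yields a single point.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Made-up labels (1 = readmitted) and model risk scores, for illustration only.
labels = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]
curve = roc_points(labels, scores)
print(round(auc(curve), 4))  # prints 0.75
```

An AUC of 0.5 corresponds to random guessing and 1.0 to a perfect ranking, which is why the score is convenient for comparing models.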

*#7: Which Algorithms are in healthcare.ai?
<http://go.healthcatalyst.com/U0Wg06E0wR0Cn2037P000JQ>*
Machine learning (ML) infuses our lives in many ways (think Amazon,
Netflix, Facebook), but healthcare has yet to adopt it in a meaningful way.
Some brand-name big players are approaching ML in healthcare through deep
learning applications, but Health Catalyst® is taking a more practical
approach. The goal of healthcare.ai is to provide access to off-the-shelf
ML algorithms that can be paired with a healthcare system’s data to create
predictive models in an easy and efficient way. Healthcare.ai serves up
many algorithms:

   - Logistic Regression – a quick algorithm for classification problems.
   - Lasso – like logistic regression, but with the ability to provide
   feedback on which features to keep or omit from a model.
   - Random Forest – an ensemble method that can accurately model
   non-linear relationships.
   - Mixed Model – can combine a personal trend with population trends.

This post details the value of the healthcare.ai site to the medical
community as a resource for developing ML to improve healthcare outcomes.
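
To give a feel for the first algorithm in the list above, here is a minimal sketch of logistic regression fitted by batch gradient descent in plain Python. It is not how healthcare.ai implements it; the data, feature choices, learning rate, and function names are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(xs[0])
    w = [0.0] * n_features
    b = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of the log loss with respect to the logit
            for k in range(n_features):
                grad_w[k] += err * x[k]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    """Predicted probability of the positive class for one feature vector."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# Made-up data: [prior admissions, length of stay in days / 10] -> readmitted?
xs = [[0, 0.2], [1, 0.3], [0, 0.1], [3, 0.9], [4, 1.2], [2, 0.8]]
ys = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(xs, ys)
low = predict_proba(w, b, [0, 0.2])
high = predict_proba(w, b, [4, 1.0])
print(f"low-risk patient: {low:.2f}, high-risk patient: {high:.2f}")
```

The output of predict_proba is the kind of probability score that a classification model hands to clinicians as a risk score.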

*#8: Applications of Healthcare Machine Learning
<http://go.healthcatalyst.com/dn0wg7KP200EQ00R6X030C0>*
This post provides a broad overview of the different types of machine
learning (ML) and classifies the algorithms into three application types:

   1. Classification vs. Regression: classification algorithms produce a
   probability score while regression algorithms predict a continuous value.
   2. Supervised vs. Unsupervised: supervised machine learning problems
   have a ground truth (a known outcome) associated with each record used to
   train the model; unsupervised problems don’t. Unsupervised algorithms
   identify patterns and use them to stratify data.
   3. Artificial Intelligence (AI): not yet highly visible in
   healthcare, but it’s making inroads with radiology and pathology (machine
   learning AI to assess images with high speed and accuracy).

With a better understanding of how the algorithms in each of these
classifications are applied to specific use cases, data architects and
scientists can begin thinking about how to apply them in their own
organizations.

*#9: Data Leakage in Healthcare Machine Learning
<http://go.healthcatalyst.com/HC003000PgYQ2E60w7R00Ln>*
Data leakage occurs when a predictive model is trained using information
that is available during training, but not in production at the time
outcomes are predicted. Leakage can make a predictive model appear deceptively
accurate in training, but then produce untrustworthy results in deployment.
This post explores the four ways data leakage occurs:

   1. A feature is used to train the model that would not be available in
   production at the time of prediction.
   2. A feature is used to train the model that would not be available in
   production prior to the outcome variable being populated.
   3. A feature is used to train the model that is outside the scope of the
   model’s intended use case.
   4. The correct outcome is leaked into the test data through a variable
   that inherently proxies for the outcome.

It goes on to explore the consequences of data leakage, how to prevent it,
and how to identify and fix it in an existing model.
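
One simple guard against the first kind of leakage listed above is to compare when each feature becomes available with when the prediction is needed. The sketch below assumes you can attach such a timestamp to every feature; the feature names, timestamps, and the leaky_features helper are hypothetical.

```python
from datetime import datetime

def leaky_features(feature_available_at, prediction_time):
    """Return names of features recorded after the moment of prediction."""
    return sorted(
        name
        for name, available_at in feature_available_at.items()
        if available_at > prediction_time
    )

# Hypothetical feature names and timestamps for one hospital stay.
prediction_time = datetime(2017, 1, 10, 8, 0)  # risk score wanted at admission
feature_available_at = {
    "age": datetime(2017, 1, 10, 7, 45),
    "admit_diagnosis": datetime(2017, 1, 10, 7, 50),
    "discharge_disposition": datetime(2017, 1, 14, 16, 0),  # known only at discharge
    "length_of_stay": datetime(2017, 1, 14, 16, 0),         # known only at discharge
}
print(leaky_features(feature_available_at, prediction_time))
# prints ['discharge_disposition', 'length_of_stay']
```

Any feature the check returns should be dropped from training, or the model should be re-scoped to a later prediction time.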

*#10: Which Regions of the US are Healthy?
<http://go.healthcatalyst.com/G7R0E230QZ6MC0nw0000gP0>*
Social determinants of health (SDOH) are vital to the health outcomes of
patients across the country. It’s especially interesting to see the
variation of those determinants from region to region. This post focuses on
the correlation between median household income and rates of low birth
weight (LBW). It also examines the correlation between income and premature
mortality.

It’s easy to assume that higher income counties experience lower rates of
LBW and lower premature mortality. Likewise, it’s easy to assume that lower
income counties experience just the opposite. But another interesting
metric created here is how well some counties take care of their newborns
and seniors despite their economic circumstances. It’s called the
Punch-Above-Their-Weight Index, or PATWI. Why some counties punch well
above their weight provides a unique perspective into how some health
systems are interacting with their populations.
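
The summary above does not say how PATWI is calculated, so the following is only one plausible way to sketch the idea: measure the income/LBW correlation, fit a straight line, and look at which counties do better than the line predicts. The county data is made up and the residual-based ranking is an assumption, not the post's definition.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def residuals(xs, ys):
    """Observed y minus the y predicted by a least-squares line through (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Made-up counties: median household income (thousands of dollars) and LBW rate (%).
counties = ["A", "B", "C", "D", "E"]
income = [35, 42, 50, 61, 75]
lbw_rate = [10.5, 8.0, 8.8, 7.5, 6.4]

r = pearson(income, lbw_rate)
res = residuals(income, lbw_rate)
# The most negative residual is the county with the lowest LBW rate
# relative to what its income alone would predict.
best = min(zip(res, counties))[1]
print(f"correlation: {r:.2f}; lowest LBW relative to income: county {best}")
```

In this toy data the correlation is negative, as the text expects, and county B stands out for doing better than its income would suggest.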

*#11: Know Your Business Question: A Focus on Readmissions
<http://go.healthcatalyst.com/eP0QNE03000RC200670wg0n>*
Reducing readmission rates is an important area of opportunity for machine
learning (ML) in healthcare. ML can create readmission risk models that
answer a specific question, use specific data, and produce actionable
results. Each component helps drive outcomes improvement (in this case,
better patient quality of life and reduced mortality).

The readmission risk model can address two different situations—risk of
readmission for patients currently in the hospital and risk for discharged
patients—and work with the appropriate data for both. With such specificity
in mind, ML-enabled readmission risk models are most successful when users
understand three key elements:

   1. The definition of the readmission outcomes variable.
   2. The specific use case for the model.
   3. The timing/target of the model.
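
As an example of pinning down the first element above, the sketch below derives a readmission outcome variable from admission and discharge dates. The 30-day window, the field names, and the sample stays are assumptions for illustration; the right definition depends on the business question.

```python
from datetime import date

def flag_readmissions(stays, window_days=30):
    """Mark each stay whose patient is admitted again within window_days of discharge."""
    by_patient = {}
    for stay in stays:
        by_patient.setdefault(stay["patient_id"], []).append(stay)
    flagged = []
    for patient_stays in by_patient.values():
        patient_stays.sort(key=lambda s: s["admit"])
        # Pair every stay with the same patient's next stay (None for the last one).
        for current, following in zip(patient_stays, patient_stays[1:] + [None]):
            gap = (following["admit"] - current["discharge"]).days if following else None
            readmitted = gap is not None and 0 <= gap <= window_days
            flagged.append({**current, "readmitted": readmitted})
    return flagged

# Made-up stays for two patients.
stays = [
    {"patient_id": "A", "admit": date(2017, 1, 1), "discharge": date(2017, 1, 5)},
    {"patient_id": "A", "admit": date(2017, 1, 20), "discharge": date(2017, 1, 25)},
    {"patient_id": "B", "admit": date(2017, 2, 1), "discharge": date(2017, 2, 3)},
    {"patient_id": "B", "admit": date(2017, 4, 1), "discharge": date(2017, 4, 4)},
]
flagged = flag_readmissions(stays)
print([s["readmitted"] for s in flagged])  # prints [True, False, False, False]
```

Changing window_days, or excluding planned admissions, changes the outcome variable and therefore the model, which is why the definition needs to be agreed on first.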

*#12: Contributing to Open Source Software Development Using Github
<http://go.healthcatalyst.com/q7EC0n0Qw10P6203R0gO000>*
To ensure version control for healthcare.ai, Health Catalyst® uses
collaboration software Git (and the accompanying Github online storage
platform) to support open source software development. Git is a valuable
tool in this process, but its language can be difficult to interpret for
all but the most expert developers.

Health Catalyst recognizes the importance of the open source platform, as
well as the need for effective version control in the development process, so
it’s providing a need-to-know guide to Git language. Examples of important
vocabulary include:

   1. Pulling and pushing: Pull changes from the master source to a local
   copy and push local changes to the online master.
   2. Branches: Used to help keep large changes, feature additions, etc.
   separate from the master source; make ongoing changes to the code in your
   branch, then switch back to the master to use code that is known to work.
*#13: Using R for Healthcare Data Analysis
<http://go.healthcatalyst.com/wRQEP6nCw0000g720003P02>*
BI developers in healthcare have a wealth of tools at their disposal, but
multiple tools are not always the most effective solution. In many use
cases, one tool—R—is just as, if not more, productive than splitting
workflow for a single dataset across multiple platforms.

BI professionals who use Excel, SSMS, Tableau, and Qlik (among others) for
each tool’s specific function can often get the same level of output with R
alone. R’s capabilities include the following, which have traditionally
required multiple platforms:

   1. Understanding how data is distributed.
   2. Finding out how particular columns are correlated.
   3. Offering pivot tables.
   4. Making histograms or scatterplots.
   5. Grouping by a column of interest and plotting a trend.
   6. Calculating statistics (e.g., standard deviations, t-tests,
   quantiles).
   7. Creating interactive visualizations for others.

*#14: Survey of Deep Learning in Radiology
<http://go.healthcatalyst.com/i32C0PR7gQw00n000Q03E06>*
While machine learning (ML) is gaining attention for its predictive
capabilities with tabular data, it may have greater value in areas of
healthcare that rely more on image data. Radiology and pathology, for
example, use some tabular data, but the core of their work is image data,
such as scans and X-rays.

With deep learning applications in ML (the technology that will enable
self-driving cars and real-time language translation) it’s reasonable to
expect that computers will soon be able to read medical images, accurately
and reliably. Research is already showing promise for deep learning
applications in medical imaging—particularly in breast screening, lung
screening, and brain segmentation. Benefits to patients and health systems
include better care through more efficient and accurate image evaluation,
which can enable lower cost treatment with less burden on resources.

*#15: Feature Engineering in Healthcare Machine Learning
<http://go.healthcatalyst.com/l2R04000w7Qg0E3n60P0R0C>*
The effectiveness of machine learning (ML) in healthcare analytics will
depend largely on building ML models with the right features to drive the
insights users need. In healthcare, a process known as feature engineering
uses domain knowledge of healthcare data to develop accurate ML models.
Feature engineering can be the key differentiator between a useful ML model
and an ineffective one.

Feature engineering enables the ML model to break down features granularly,
so that analysts can home in on features with the greatest predictive
value. For example, using a home address as a predictive feature,
engineering can break down the address into specific components (city, zip
code, etc.) to pull information that might yield greater insight into the
patient’s condition and circumstances. Increasingly specific functions of
ML will add to more complete, useful patient datasets and the overall goal
of improving healthcare.
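
The address example above might look like the following sketch, which splits a US-style address string into separate features. The expected address format, the regular expression, the sample address, and the zip3 feature are assumptions chosen for illustration.

```python
import re

# Assumes addresses shaped like "street, city, ST 12345" (optionally ZIP+4).
ADDRESS_PATTERN = re.compile(
    r"^(?P<street>[^,]+),\s*(?P<city>[^,]+),\s*(?P<state>[A-Z]{2})\s+(?P<zip>\d{5})(?:-\d{4})?$"
)

def address_features(address):
    """Split a 'street, city, ST 12345' string into separate model features."""
    match = ADDRESS_PATTERN.match(address.strip())
    if match is None:
        # Unparseable addresses yield missing values rather than an error.
        return {"city": None, "state": None, "zip": None, "zip3": None}
    zip_code = match.group("zip")
    return {
        "city": match.group("city").strip(),
        "state": match.group("state"),
        "zip": zip_code,
        "zip3": zip_code[:3],  # coarser region code: first three digits
    }

print(address_features("123 Main St, Salt Lake City, UT 84101"))
# prints {'city': 'Salt Lake City', 'state': 'UT', 'zip': '84101', 'zip3': '841'}
```

Each extracted component can then be joined to regional data or used directly as a categorical feature.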



