Depression is a significant challenge for ambulatory care because it worsens health status and outcomes, increases health care utilization and costs, and elevates suicide risk. An automatic telephonic assessment (ATA) system that links with tasks and alerts to providers may improve quality of depression care and increase provider productivity. We used an ATA system in a trial to assess and monitor depressive symptoms of 444 safety-net primary care patients with diabetes. We assessed system properties, evaluated preliminary clinical outcomes, and estimated cost savings. The ATA system is feasible, reliable, valid, safe, and likely cost-effective for depression screening and monitoring in a low-income primary care population.
DEPRESSION impairs functional status, worsens clinical outcomes, increases health care costs, and is a major contributor to the burden of suicide (Ferrari et al., 2013). In safety-net primary care settings, depression diagnosis, treatment, and follow-up have been suboptimal (Ani et al., 2009), despite evidence that regular depression screening and adaptive pharmacotherapy or counseling can be effective in improving depression symptoms, medication adherence, and patients’ self-care (Bell et al., 2011; Ell et al., 2010, 2011; Lin et al., 2004).
To narrow the gap of implementing evidence-based care, there is growing evidence that automated telecommunications systems may reduce provider labor costs and be equally effective for certain chronic illness care management, such as patient symptom monitoring and self-care reminders (Friedman et al., 1997; Kim et al., 2007; Kroenke, Theobald, et al., 2010; Mahoney et al., 1999; Piette, 2000; Piette et al., 2000; Xu et al., 2012). However, no study has described a depression telecommunications system design that is fully integrated with a clinical information system to support patients and providers in routine primary care practice. This study discusses the design of a patient-informed health IT (HIT) system that can efficiently integrate data and successfully engage care providers as needed to deliver additional care and support to in-need patients.
The study also describes system testing to ensure its ability to access patients, obtain valid and complete data, and respond in a timely manner to patients at elevated risk of suicide. These analyses inform whether the system is effective and safe for clinical use. Potential labor cost savings are estimated. The data are from a safety-net Diabetes-Depression Care-management Adoption Trial (DCAT) (Wu et al., 2013). The trial targeted patients with diabetes to test the technology solution because they are twice as likely to experience clinically significant depressive symptoms as the general population, especially among urban, low-income patients (Egede et al., 2002; Richards, 2011).
DESIGN OF AUTOMATED TECHNOLOGY FOR CARE MANAGEMENT
System design principles
Four principles guided the DCAT automatic telephonic assessment (ATA) system design: (1) customizable to be patient-centered, (2) flexible for clinical purposes, (3) integrated into the existing clinical information system, and (4) capable of coordinating team-based care workflow. These design principles address barriers to adoption and sustainability of the evidence-based collaborative depression care (Archer et al., 2012; Bell et al., 2011; Katon et al., 2010; Palinkas et al., 2011). The aim is to bring out the humanity in care management through patient-informed, proactive automated evidence-based data mining and patient outreach to assist patients in maintaining their peace of mind, and obtaining or receiving assistance when needed.
An ATA system that links with tasks and alerts to providers should shift the burden of routine work to machines and engage the provider as needed; thus, an ATA system might facilitate better quality of care while reducing labor costs. With ATA system outbound calls, the burden is not on patients, and clinicians do not waste time on unfruitful call attempts. Preconfigured, easily modified schedules allow for reliable calls. If calls are unanswered, the ATA system attempts to reach patients many times with no human resource expense. ATA system call contents can be customized by clinical purpose (eg, monitor symptoms, medication adherence, patient self-care) (Bauer et al., 2013). Calls can be tailored to patient preferences (eg, language, call time, frequency) to maximize reach.
The DCAT algorithmic architecture consists of the ATA patient call system and a Web-based provider notification, tasking, and alert system (Figure 1) integrated into an existing diabetes disease management registry (DMR) of the study site.
ATA modules and activation
The DCAT ATA automatically calls patients for depression screening or monitoring. “Amy,” the automated call system persona, uses a natural speaking voice. ATA system has a modular design that currently includes 6 independent modules: Patient Health Questionnaire (PHQ) 2 or 9 items (Huang et al., 2006; Kroenke, Spitzer, et al., 2010; Wittkampf et al., 2007), 1-item pain assessment, 4-item antidepressant medication (AM) adherence assessment, 2-item psychotherapy practice assessment, 2-item depression self-care activity prompting, and 2-item patient request for contact from a clinician or a bilingual (English/Spanish) project assistant (PA). The clinician contact request item allows patients to discuss their unmet needs or questions about the diabetes care or depression concerns. Each module can be automatically activated on the basis of the patient’s medical history and current information from patient electronic records.
ATA screening and monitoring calls
During DCAT, the ATA system screened nondepressed diabetic patients quarterly to facilitate early detection of depression and monitored depressed patients monthly. The outbound calls followed a dynamic schedule based on calendar date, patient-preferred call days and time, and call history. Patients could opt for password-protected access and could use the system to reschedule the call or request human follow-up. Screening calls included 4 of the 6 aforementioned modules: (1) depressive symptoms screening with PHQ-2 (Kroenke et al., 2003), followed by the full PHQ-9 if the PHQ-2 score was more than 2, (2) pain assessment, (3) self-care activity prompting, and (4) patient request for provider contact. Monitoring calls included PHQ-9 assessment for symptom monitoring, plus the other 5 modules mentioned previously. However, the AM adherence assessment was triggered only if a patient’s medical record showed an active AM prescription; likewise, the psychotherapy practice assessment module was triggered only for a record of psychotherapy.
In the screening call, if a patient’s PHQ-9 score was more than 8, he or she was identified as a potentially depressed patient and began receiving monthly monitoring calls. If a depressed patient had a PHQ-9 score of less than 8 in 6 consecutive monitoring calls, he or she was transferred into the nondepressed group and received quarterly screening calls.
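The screening and monitoring rules above can be sketched as a small state machine. This is a minimal illustration, not the DCAT rule engine: the names `PatientState`, `modules_for_call`, and `update_state` are hypothetical, and two behaviors the text does not specify are assumptions here (a score of exactly 8 or higher resets the consecutive-call counter, and a monitoring call with no PHQ-9 score leaves the state unchanged).

```python
from dataclasses import dataclass
from typing import List, Optional

PHQ2_FOLLOWUP_THRESHOLD = 2   # full PHQ-9 is asked if the PHQ-2 score is more than 2
PHQ9_THRESHOLD = 8            # more than 8 starts monitoring; less than 8 counts toward step-down
CONSECUTIVE_LOW_CALLS = 6     # consecutive low monitoring calls before returning to screening


@dataclass(frozen=True)
class PatientState:
    depressed: bool = False   # False: quarterly screening; True: monthly monitoring
    low_streak: int = 0       # consecutive monitoring calls with PHQ-9 below the threshold


def call_interval(state: PatientState) -> str:
    return "monthly" if state.depressed else "quarterly"


def needs_full_phq9(phq2_score: int) -> bool:
    return phq2_score > PHQ2_FOLLOWUP_THRESHOLD


def modules_for_call(state: PatientState, active_am_prescription: bool,
                     psychotherapy_record: bool) -> List[str]:
    """Modules asked in the next call, following the activation rules in the text."""
    if not state.depressed:
        return ["PHQ-2", "pain", "self-care", "contact request"]
    modules = ["PHQ-9", "pain", "self-care", "contact request"]
    if active_am_prescription:
        modules.append("AM adherence")
    if psychotherapy_record:
        modules.append("psychotherapy practice")
    return modules


def update_state(state: PatientState, phq9_score: Optional[int]) -> PatientState:
    """Apply one call result. phq9_score is None when no PHQ-9 was collected."""
    if phq9_score is None:
        return state  # assumption: a call without a PHQ-9 score changes nothing
    if not state.depressed:
        if phq9_score > PHQ9_THRESHOLD:
            return PatientState(depressed=True, low_streak=0)
        return state
    if phq9_score < PHQ9_THRESHOLD:
        streak = state.low_streak + 1
        if streak >= CONSECUTIVE_LOW_CALLS:
            return PatientState(depressed=False, low_streak=0)
        return PatientState(depressed=True, low_streak=streak)
    # assumption: any score of 8 or more breaks the run of consecutive low scores
    return PatientState(depressed=True, low_streak=0)
```

Keeping the thresholds as named constants reflects the design principle that call rules should be easy to reconfigure for other clinical purposes.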
ATA system used an automated speech-recognition (ASR) technology or an interactive voice recorded system with touchtone response for English speakers, but only interactive voice recorded system for Spanish speakers. (The ASR technology has been proven to record 98%-100% of English responses correctly but correctly records only 70%-80% of Spanish responses.) If a patient refused ATA calls or did not complete critical data such as PHQ scores, a human caller contacted the patient. Only ATA calls were used in the property evaluation discussed later.
Provider task and suicidal alert system
The ATA results were automatically integrated into the diabetes DMR for clinician review. If a human made the call, he or she manually entered the information.
The DMR was enhanced to include a provider task system that triggered tasks in response to specific issues identified from ATA calls: patient requests for contact due to data capture problems were assigned to the PA; patient requests for contact due to care issues or to high PHQ scores and AM adherence issues went to the patient’s care manager; and requests due to high PHQ scores but no AM issue went to the social worker. Care managers could trigger tasks to consult a physician. Care managers recorded task results via a user-friendly structured document function.
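The routing rules in this paragraph can be written as a short lookup. The sketch below is illustrative only: the function name, the role strings, and the `contact_reason` values are hypothetical, and what counts as a "high" PHQ score is left to the caller because the text does not give the task threshold.

```python
from typing import List, Optional


def route_tasks(contact_reason: Optional[str], high_phq: bool,
                am_adherence_issue: bool) -> List[str]:
    """Return the provider roles that should receive a task after an ATA call.

    contact_reason is None (no request), "data_capture", or "care_issue".
    """
    roles = set()
    if contact_reason == "data_capture":
        roles.add("project assistant")
    elif contact_reason == "care_issue":
        roles.add("care manager")
    if high_phq and am_adherence_issue:
        roles.add("care manager")
    elif high_phq:
        roles.add("social worker")
    return sorted(roles)  # sorted only to make the output deterministic
```

For example, `route_tasks("care_issue", True, False)` yields both the care manager (for the contact request) and the social worker (for the high score without a medication issue).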
The DCAT ATA system included a unique suicidal alert system. Three dedicated physicians responded to the alerts in a waterfall fashion, using a predefined protocol that included information on community response resources (eg, hotlines and language interpreter services). If a patient scored the last item of the PHQ-9 as 2 or 3 (indicating high intention of self-harm), the system immediately sent a short message service (SMS) text message and an e-mail to the first physician. If the first physician did not acknowledge the alert within 15 minutes, the system alerted the second physician, and so on until a physician took responsibility for the call. ATA system provided patient demographic, contact information, and PHQ-9 results to the responding physician, who then called the patient, assessed his or her status, and responded appropriately.
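The waterfall escalation can be simulated in a few lines. This is a sketch under stated assumptions rather than the production alert logic: `escalate` and its inputs are hypothetical, acknowledgment "within 15 minutes" is read as 15 minutes or less, and the case where no physician acknowledges is returned as `None` because the text does not describe it.

```python
from typing import List, Optional, Tuple

ACK_WINDOW_MIN = 15  # minutes each physician has to acknowledge before escalation


def triggers_alert(phq9_item9_score: int) -> bool:
    """The last PHQ-9 item scored 2 or 3 triggers a suicidal alert."""
    return phq9_item9_score in (2, 3)


def escalate(ack_delays: List[Optional[float]]) -> Tuple[Optional[int], float]:
    """Walk the physician list in waterfall order.

    ack_delays[i] is how many minutes physician i takes to acknowledge after
    being alerted, or None if that physician never acknowledges.
    Returns (index of the responding physician or None, minutes since first alert).
    """
    elapsed = 0.0
    for index, delay in enumerate(ack_delays):
        if delay is not None and delay <= ACK_WINDOW_MIN:
            return index, elapsed + delay
        elapsed += ACK_WINDOW_MIN  # window expired; alert the next physician
    return None, elapsed
```

With three physicians, the worst case in which the third physician still responds in time is 45 minutes after the first alert was sent.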
DCAT PROGRAM DESCRIPTION
The applied, quasi-experimental DCAT study was conducted in collaboration with the Los Angeles County Department of Health Services at Ambulatory Care Network–Research & Innovation Practice Based Research Network clinics targeting low-income, racially/ethnically diverse diabetic patients. (The Web site is http://ladhs.lacounty.gov/wps/portal/ACN/.) Wu et al. (2013) described the trial, which reflected 3 care delivery modes: usual care (UC), supported care (SC), and technology-facilitated care (TC). Usual care is the status quo, the most common mode in safety-net clinics. Both SC and TC included a Department of Health Services Disease Management Program to support diabetes care with protocol-based depression screening and treatment (Bodenheimer et al., 2002).
The ATA system was tested and implemented in the TC arm with 444 patients who had the following characteristics: 62% female, average age 52.6 years, 91% Hispanic/Latino, 26% PHQ score more than 10 at baseline. The study recruiters demonstrated ATA calls to participants during recruitment and assessed their preferences (eg, language, call time, password-protected access). The PA configured each call in the DMR; an algorithm-driven rule engine then processed DMR clinical and patient preference data to determine automated call characteristics for each patient (including frequency of call, applicable modules and questions to be asked, language, and call time). Patients then received periodic calls, assessing depression symptoms and treatment adherence, with responses automatically documented in the DMR. If patients exhibited depressive symptoms, self-harm intention, or concern about medication, automated task or alerts would engage providers (eg, nurse care managers, social workers, emergency responders) to provide appropriate care management to in-need patients.
SYSTEM EVALUATION METHODS AND RESULTS
We evaluated 4 critical system properties to ensure that the ATA system could be effective for clinical use. “Feasibility” assessed ATA access to patients. “Reliability” assessed consistent retrieval of requested information so proper follow-up can be triggered. “Validity” compared machine versus human modes of administration to ensure no differential system reporting. “Safety” compared the frequency and response time of suicidal alerts to established expectations. Methods and results for the system properties evaluation are described later and summarized in Table 1. We also modeled ATA system potential for reducing costs in the safety net environment.
To assess the critical factor of ATA system access to patients, we analyzed the rate of completed calls per month by language. A call was defined as complete if it reached the patient and recorded answers to all questions asked. The PA followed up with patients to explore reasons, including technology-related issues, for incomplete calls to ensure that patients could complete calls successfully without technical problems and to determine potential improvements.
During the first 11 months of the trial, the 444 patients on average received 4.13 calls (maximum 10). Of the 2078 calls, 51.6% were completed (55.9% of 387 English calls and 50.7% of 1691 Spanish calls). The completed call rate was stable when calculated month to month. In the initial 6 months, 76.6% of patients had at least 1 completed call. After 10 months, 94.8% of patients had received at least 1 call and 79.7% of patients had completed at least 1 call.
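The overall completion rate is the call-weighted average of the per-language rates, which gives a quick consistency check on the figures above. The sketch uses only the counts and percentages reported in this paragraph; the function name is hypothetical.

```python
def weighted_completion_rate(groups):
    """groups: list of (number_of_calls, completion_rate) pairs, one per language."""
    total_calls = sum(calls for calls, _ in groups)
    completed = sum(calls * rate for calls, rate in groups)
    return completed / total_calls


# Reported figures: 387 English calls at 55.9% and 1691 Spanish calls at 50.7%.
overall = weighted_completion_rate([(387, 0.559), (1691, 0.507)])
print(f"{overall:.3f}")
```

The result agrees with the reported 51.6% to within the rounding of the per-language percentages.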
We contacted all 248 patients with an incomplete call for this analysis but reached only 186 (75%). Half of those (51%) cited inconvenient call time as the reason for not completing the automated call; other reasons included preference for human calls, bad cell phone connection, disconnected phone, nonworking phone, and/or personal reasons. Patients who requested human calls were then switched from automated calls to human calls (which were not included in the analysis).
In the first 11 months of the trial, about a quarter of patients (120/444, or 27%) requested a call from the PA during their automated call. Some patients had accidentally answered “yes” to the PA call request question; others wanted the PA to clarify the study. Only 7 patients encountered technical problems with the ATA system, indicating patients’ ability to complete calls successfully. The PA was unable to reach 10 patients, whom we excluded from this analysis.
Response capture rate
We used patient response data to examine consistent retrieval of requested information by ATA system. Response capture rate was calculated for each assessment question by dividing the total number of times a question was asked and captured by the total number of times the question was asked. We then averaged the response rates for all questions contained within an ATA module.
We examined 2078 automated calls made in 11 months. The average response rates were high in every module: depressive symptoms (99%), pain (100%), self-management activities (95%), patient call requests (99%), AM adherence (99%), and psychotherapy (99%).
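The response capture rate calculation described above (a per-question rate, then averaged within each module) can be sketched as follows. The record layout and function name are assumptions made for illustration; the sample records in the usage note are invented and are not trial data.

```python
from collections import defaultdict
from statistics import mean


def module_capture_rates(records):
    """records: iterable of (module, question_id, captured) tuples,
    one tuple for every time a question was asked.
    Returns {module: mean of the per-question capture rates}."""
    asked = defaultdict(int)
    captured = defaultdict(int)
    for module, question_id, was_captured in records:
        asked[(module, question_id)] += 1
        captured[(module, question_id)] += int(was_captured)

    per_module = defaultdict(list)
    for (module, question_id), n_asked in asked.items():
        per_module[module].append(captured[(module, question_id)] / n_asked)
    return {module: mean(rates) for module, rates in per_module.items()}
```

For instance, a module whose first question is captured 2 of 2 times and whose second is captured 1 of 2 times has a module-level rate of 0.75. Averaging per question rather than pooling all responses gives each question equal weight regardless of how often it was asked.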
ASR technology captured data
To ensure that the ASR technology captured patients’ intended responses, we analyzed responses of 82 calls completed in the first 3 months of DCAT by listening to the audiorecorded calls and comparing them with the captured responses in the DMR. Error rates were calculated on the basis of the total number of incorrect responses, including those not captured by ATA system. Each error was analyzed to determine whether the incorrect capture impacted the patient’s care.
Impactful errors included (1) downcoding of the PHQ-9 response, potentially leading to false-negative diagnosis of depression, and (2) upcoding of responses to medication questions, causing the patient’s medication adherence to seem higher than actuality.
Error rates were 14.5%; impactful errors accounted for 9.79% of the total error. To reduce error rates, we added prompts in the call to remind patients of the response categories, and we expanded acceptable response categories to better capture terms that patients use. We conducted a second analysis of 1 month of calls (19 calls) completed after we made the changes. The overall error rate decreased to 7.7%. Impactful errors were also reduced and accounted for only 4.17% of total errors.
We compared 58 pairs of PHQ-9 scores recorded from ATA calls with scores for the same patient acquired within 7 days from clinicians (33), outcome interviews (15), or baseline assessment (10) conducted by patient recruiters. Patient Health Questionnaire–9 scores were coded into a 5-level severity variable (0-4, 5-9, 10-14, 15-19, and ≥20) since there could be some variation in scores when assessing symptoms. The Spearman correlation and Wilcoxon signed rank sum for ordinal variables were used to test the correlation and difference of the 2 PHQ scores. The results showed that ATA system-captured scores were statistically significantly (at α = 0.05) correlated with those obtained from other sources, and there was no statistically significant difference in their severity ranking.
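The severity coding and the rank correlation can be illustrated with the standard library alone. This is a sketch, not the analysis code used in the study: the function names are hypothetical, the Spearman coefficient is computed as a Pearson correlation on average ranks (the usual treatment of ties), and the Wilcoxon signed rank test is omitted. The function is undefined when either variable is constant.

```python
from math import sqrt
from statistics import mean


def severity_level(phq9_score: int) -> int:
    """Map a PHQ-9 total (0-27) to the 5 levels 0-4, 5-9, 10-14, 15-19, and >=20."""
    if not 0 <= phq9_score <= 27:
        raise ValueError("PHQ-9 total must be between 0 and 27")
    return min(phq9_score // 5, 4)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation of two equal-length sequences."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    covariance = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    spread = sqrt(sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry))
    return covariance / spread
```

Coding to severity levels before comparison means that two scores such as 11 and 13 count as agreement, which matches the stated rationale of tolerating small variation between assessments.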
We collected all suicide intention alerts, along with the name of the responding physician and the timestamps for the times the call was completed, the alert was sent, and the alert was responded to. We then analyzed the alert response time (ie, the elapsed time between each completed call that triggered the alert and the physician’s response).
There were 25 suicide intention alerts out of 2189 calls (1224 calls with completed PHQ) during the 11-month trial. They were from 21 patients; 1 patient triggered 3 alerts, 2 patients triggered 2 alerts each, and 18 patients triggered 1 alert each. One alert was an outlier and was excluded from the analysis. The average response time (32 minutes) was within the trial’s 75-minute target response time (comprising alert polling interval of 30 minutes and three 15-minute response intervals). Only 3 alerts exceeded the target response time. The maximum alert response time was 2 hours 4 minutes.
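The 75-minute target and the response time measure can be expressed directly from the definitions above. The sketch below assumes alerts are available as pairs of timestamps (call completed, physician responded); the function names and this data layout are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import mean

POLLING_INTERVAL = timedelta(minutes=30)   # alert polling interval
RESPONSE_INTERVAL = timedelta(minutes=15)  # acknowledgment window per physician
N_RESPONDERS = 3

# 30 minutes of polling plus three 15-minute response intervals = 75 minutes
TARGET = POLLING_INTERVAL + N_RESPONDERS * RESPONSE_INTERVAL


def response_minutes(call_completed: datetime, responded: datetime) -> float:
    """Elapsed minutes between the completed call and the physician's response."""
    return (responded - call_completed).total_seconds() / 60


def summarize(alerts):
    """alerts: list of (call_completed, responded) datetime pairs."""
    minutes = [response_minutes(start, end) for start, end in alerts]
    target_minutes = TARGET.total_seconds() / 60
    return {
        "mean_minutes": mean(minutes),
        "max_minutes": max(minutes),
        "over_target": sum(m > target_minutes for m in minutes),
    }
```

Measuring from call completion rather than from alert dispatch includes the polling delay in the response time, which is why the polling interval is part of the target.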
We modeled the costs by assuming that a nurse care manager or the ATA system would attempt to call every diabetic patient by phone to assess PHQ and, when applicable, medication adherence. We estimated the time and cost to successfully reach a patient who would be in need of depression care. The process would require multiple call attempts to reach a diabetic patient and complete a depression assessment. DCAT baseline data indicated that approximately 30% of the diabetic patients would also need depression care (Wu et al., 2013). The data sources for the cost evaluation also included the ATA call records and a survey of 12 providers taking part in DCAT who estimated the time (in minutes) to screen for depression using a PHQ scale and to assess medication adherence and other issues in a phone call. The average cost to identify, reach, and complete a call for a patient in need of depression care for the UC approach was $35; for the ATA system, it was about $1 excluding the currently unknown cost of the technology.
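One way to structure a cost estimate of this kind is sketched below. It is not the model used in the study: the formula, the function name, and every parameter are assumptions for illustration. In particular, the sketch assumes independent call attempts (so the expected number of attempts per completed call is the reciprocal of the completion rate) and counts labor cost only.

```python
def cost_per_patient_in_need(completion_rate: float, prevalence: float,
                             minutes_per_attempt: float,
                             minutes_per_assessment: float,
                             hourly_labor_cost: float) -> float:
    """Expected labor cost to reach and assess one patient who needs depression care.

    completion_rate: probability that a single call attempt is completed
    prevalence: fraction of called patients who need depression care
    """
    if not (0 < completion_rate <= 1 and 0 < prevalence <= 1):
        raise ValueError("completion_rate and prevalence must be in (0, 1]")
    expected_attempts = 1 / completion_rate  # assumes independent attempts
    labor_minutes = expected_attempts * minutes_per_attempt + minutes_per_assessment
    cost_per_completed_call = labor_minutes * hourly_labor_cost / 60
    # Only a fraction of completed calls identify a patient in need of care.
    return cost_per_completed_call / prevalence
```

Under this structure, an automated system drives the labor term toward zero, so the remaining cost is the technology itself, which the estimate above excludes.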
This study contributes the system design of an innovative, patient-centered, automated, integrated HIT approach to facilitate ambulatory depression care management, and a methodology to test such a system. Based on the design blueprint, the DCAT demonstrated how the ATA system worked. The test results show that the ATA system is a feasible way to access safety-net diabetes patients to obtain reliable and valid patient-informed data for depression screening/monitoring; results also demonstrate that the ATA system effectively engages providers, including as responders to suicidal alerts. Estimated labor costs are substantially reduced.
Clinical outcomes must await completion of the study. But at 6 months after DCAT implementation, we have compared 3 key depression clinical outcomes of patients in the ATA system intervention arm (TC) with outcomes in the 2 comparison arms (UC and SC) for (1) reduction of PHQ-9 score at 6 months from baseline, (2) reduction of the percentage of patients with major depression at 6 months from baseline, and (3) the percentage of patients depressed at baseline who were in remission at 6 months. A comparison of the descriptive statistics with 95% confidence intervals suggests that at 6 months the percentage of patients with major depression had been reduced more in the TC group than the others: TC went from 26% to 13% versus UC and SC from 28%–30% to 22%–24%. Positive trends in PHQ-9 score reduction and remission rates also appear greater for the TC patients, but full analysis will not be completed for another 6 months of follow-up.
These findings are consistent with previous studies with different populations, conditions, or assessment technology capabilities (Kim et al., 2007; Kroenke, Theobald, et al., 2010; Mahoney et al., 1999; Piette, 2000; Piette et al., 2000; Xu et al., 2012).
Our findings have several important implications for TC management. The ATA system is able to measure patients’ depressive symptoms as well as alternative manual methods, which shows its potential to add value to the care management process not only for depression but also for other stigmatized medical conditions (Xu et al., 2012). Our data regarding suicide or self-harm alert frequency and response times can help establish a baseline for future system design and workload expectations to include critical suicidal risk assessment. Even with patient preferences considered in the automatic outreach call design, the ATA system still missed 20% of the patients over 11 months; this result warrants further research to ensure clinical applicability of the system for real-world implementation. Although we are encouraged by the favorable preliminary outcomes, we emphasize that automation is only meant to improve efficiency in data collection and facilitation of clinical and team workflow. Quality and outcomes can be improved only if a technology can amplify the humanity of the health care system.
Although the study estimated that the potential labor cost savings of using ATA system to support depression care are manifold, the cost of the technology itself needs to be taken into account. The cost-effectiveness of the technology system, then, would depend on the size and characteristics of the patient base it serves, patients’ and providers’ acceptance and adoption of the technology, the existing HIT infrastructure and capabilities, and duration of program implementation.
Limitations and future research
1. ATA system was tested in an urban, low-income, predominantly minority population of patients in a diabetes disease management program. The feasibility of this system to reach general primary care population should be further studied.
2. Data were collected after an initial 10 months of use. Longer-term follow-up is needed to demonstrate ATA system use on a regular basis to assist clinical care.
3. There were no clinical guidelines or literature data to guide the time interval between responders in the suicidal alert system. The 15-minute interval was based solely on clinical judgment of providers we consulted. The clinical appropriateness of the time interval should be further studied.
4. We have not yet fully assessed DCAT cost-effectiveness because the plan is to analyze it after a 12-month follow-up to ensure a sufficient basis for analysis.
5. Time did not permit full evaluation of differences in depression outcomes between TC, SC, and UC. Future analysis will take into account differences in patient and practice characteristics between study arms.
DCAT demonstrated that ATA system is feasible, reliable, and clinically valid for the purpose of depression screening and monitoring in the safety-net population. Preliminary cost and clinical outcomes are also promising. Although ATA system is unlikely to supplant the role of providers, it acts as a very useful complement to provider-based depression screening and offers a means to target individuals for additional professional assessment. It can improve adoption of depression screening and monitoring in the primary care setting by shifting the burden of routine tasks from time-pressured providers to the ATA system. With more efficient monitoring, ATA system facilitates more timely depression treatment, allowing clinicians to provide more compassionate care and improving patient outcomes. ATA system has the potential to make health care more patient-centered by improving depression monitoring and care management even in resource-constrained settings such as safety-net clinics.
Ani C., Bazargan M., Hindman D., Bell D., Rodriguez M., Baker R. S. (2009). Comorbid chronic illness and the diagnosis and treatment of depression in safety net primary care settings. Journal of the American Board of Family Medicine, 22(2), 123–135. doi:10.3122/jabfm.2009.02.080035
Archer J., Bower P., Gilbody S., Lovell K., Richards D., Gask L., Coventry P. (2012). Collaborative care for people with depression and anxiety. Cochrane Database of Systematic Reviews, 10, CD006525.
Bauer A. M., Schillinger D., Parker M. M., Katon W., Adler N., Adams A. S., Karter A. J. (2013). Health literacy and antidepressant medication adherence among adults with diabetes: The Diabetes Study of Northern California (DISTANCE). Journal of General Internal Medicine, 28(9), 1181–1187.
Bell R. A., Franks P., Duberstein P. R., Epstein R. M., Feldman M. D., Fernandez Y., Kravitz R. L. (2011). Suffering in silence: Reasons for not disclosing depression in primary care. Annals of Family Medicine, 9(5), 439–446. doi:10.1370/afm.1277
Bodenheimer T., Wagner E. H., Grumbach K. (2002). Improving primary care for patients with chronic illness: The chronic care model, Part 2. The Journal of the American Medical Association, 288, 1909–1914.
Egede L. E., Zheng D., Simpson K. (2002). Comorbid depression is associated with increased health care use and expenditures in individuals with diabetes. Diabetes Care, 25(3), 464–470.
Ell K., Katon W., Xie B., Lee P. J., Kapetanovic S., Guterman J., Chou C. P. (2010). Collaborative care management of major depression among low-income, predominantly Hispanic subjects with diabetes: A randomized controlled trial. Diabetes Care, 33(4), 706–713. doi:10.2337/dc09-1711
Ferrari A. J., Charlson F. J., Norman R. E., Patten S. B., Freedman G., Murray C. J. L., Whiteford H. A. (2013). Burden of depressive disorders by country, sex, age, and year: Findings from the Global Burden of Disease Study 2010. PLOS Medicine, 10(11), e1001547. doi:10.1371/journal.pmed.1001547
Friedman R. H., Stollerman J. E., Mahoney D. M., Rozenblyum L. (1997). The virtual visit: Using telecommunications technology to take care of patients. Journal of the American Medical Informatics Association, 4(6), 413–425.
Huang F. Y., Chung H., Kroenke K., Delucchi K. L., Spitzer R. L. (2006). Using the Patient Health Questionnaire–9 to measure depression among racially and ethnically diverse primary care patients. Journal of General Internal Medicine, 21, 547–552.
Katon W. J., Lin E. H. B., Von Korff M., Ciechanowski P., Ludman E. J., Young B., McCulloch D. (2010). Collaborative care for patients with depression and chronic illnesses. New England Journal of Medicine, 363, 2611–2620.
Kim H., Bracha Y., Tipnis A. (2007). Automated depression screening in disadvantaged pregnant women in an urban obstetric clinic. Archives of Women’s Mental Health, 10(4), 163–169. doi:10.1007/s00737-007-0189-5
Kroenke K., Spitzer R. L., Williams J. B. (2003). The Patient Health Questionnaire–2: Validity of a two-item depression screener. Medical Care, 41(11), 1284–1292. doi:10.1097/01.mlr.0000093487.78664.3c
Kroenke K., Spitzer R. L., Williams J. B. W., Lowe B. (2010). The Patient Health Questionnaire somatic, anxiety, and depressive symptom scales: A systematic review. General Hospital Psychiatry, 32, 345–359.
Kroenke K., Theobald D., Wu J., Norton K., Morrison G., Carpenter J., Tu W. (2010). Effect of telecare management on pain and depression in patients with cancer: A randomized trial. The Journal of the American Medical Association, 304(2), 163–171. doi:10.1001/jama.2010.944
Lin E. H., Katon W., Von Korff M., Rutter C., Simon G. E., Oliver M., Young B. (2004). Relationship of depression and diabetes self-care, medication adherence, and preventive care. Diabetes Care, 27(9), 2154–2160.
Mahoney D., Tennstedt S., Friedman R., Heeren T. (1999). An automated telephone system for monitoring the functional status of community-residing elders. Gerontologist, 39(2), 229–234.
Palinkas L. A., Ell K., Hansen M., Cabassa L., Wells A. (2011). Sustainability of collaborative care interventions in primary care settings. Journal of Social Work, 11(1), 99–117.
Piette J. D. (2000). Interactive voice response systems in the diagnosis and management of chronic disease. American Journal of Managed Care, 6(7), 817–827.
Piette J. D., Weinberger M., McPhee S. J., Mah C., Kraemer F. B., Crapo L. M. (2000). Do automated calls with nurse follow-up improve self-care and glycemic control among vulnerable patients with diabetes? American Journal of Medicine, 108, 20–27.
Richards D. (2011). Prevalence and clinical course of depression: A review. Clinical Psychology Review, 31, 1117–1125.
Wittkampf K. A., Naeije L., Schene A. H., Huyser J., van Weert H. C. (2007). Diagnostic accuracy of the mood module of the Patient Health Questionnaire: A systematic review. General Hospital Psychiatry, 29, 388–395.
Wu S., Ell K., Gross-Schulman S., Sklaroff L. M., Katon W., Nezu A., Guterman J. (2013). Technology-facilitated depression care management among predominantly Latino diabetes patients within a public safety net care system: Comparative effectiveness trial design. Contemporary Clinical Trials, S1551-7144(13), 00173–0. doi:10.1016/j.cct.2013.11.002