MSBA Capstone  ·  University of Montana  ·  Spring 2026

Field education is where social work begins.
This keeps it strong.

Social work programs collect end-of-practicum evaluations every year because accreditation requires it. Practicum Evaluation Insights turns that data into agency profiles, review flags, trend views, and a dashboard for field education leadership, helping teams review agency patterns more clearly over time.

2,258 Student evaluations analyzed
222 Practicum agencies profiled
12 Academic years of data
64 Agencies flagged for review
01
Where This Came From

I saw this problem from the inside.

As a staff member in a field education program, I worked with Lumivero’s practicum placement software every day. It did exactly what it was built to do. It housed active placement sites, managed forms, and handled the administrative backend work.

What it did not do was help the team review agency patterns over time. After each practicum cycle, the team reviewed student evaluations, summarized what worked and what did not, and filed the results away. There was no clear way to compare patterns across agencies, no trend view, and no early sign that a placement site was starting to decline.

That became harder to ignore through two parts of my work. One was reading through the most difficult student evaluations. The other was speaking with an alumna whose practicum experience nearly pushed her away from the profession entirely.

Then came accreditation work

While working on our program's accreditation review, I found something important that focused the direction of this project. We were meeting CSWE benchmarks across the competencies. But I kept coming back to the same question. What does that actually tell us about what students are experiencing at their practicum sites?

What this project addresses

The evaluation process provided useful information, but the review structure did not clearly track agency patterns over time. This workflow was built to address that.

"She told me her practicum had been so discouraging it nearly deterred her from wanting to practice social work at all."

An alumna of the program
What the annual process looked like
1

Students complete evaluations

End-of-practicum surveys collected at the close of each placement cycle.

2

The team summarizes the results

Highlights from the strongest and weakest experiences are identified manually.

3

Results are filed away

There is no multi-year compilation, no view of agency-level patterns, and no historical insight carried into the next cycle.

!

No infrastructure to act on patterns

Data was being collected, but there was no system in place for ongoing reviews.

02
The Problem

Important practicum concerns can be hard to spot before the placement ends.

Field education is the signature pedagogy of social work. Students complete at least 450 hours at a placement site, usually around 15 to 20 hours each week. A strong placement can build confidence and professional judgment. A weaker one can leave students underused, underprepared, or questioning whether they want to stay in the profession.

No systematic review across agencies and years

Evaluations are collected and filed, but there is no consistent process for comparing patterns across agencies and over time. Each cycle begins with limited visibility into what happened before.

No year-to-year trend monitoring

A site may begin to decline across multiple student cohorts, and that pattern can remain difficult to see because there is no structured way to track it over time.

No early sign that a site is starting to weaken

Concerns often become clear only after the placement has ended, when students have already completed the experience, and nothing about that placement can be changed.

This was not a lack of effort

The effort to collect and review evaluations was there. What did not exist was the structure needed to turn that information into agency-level monitoring. This project builds that structure.

The Benchmark Question

Meeting competency benchmarks does not mean students had a strong placement experience.

CSWE requires programs to assess student competence, which is important. But competency success and placement quality measure different things, and neither can substitute for the other.

What competency benchmarks measure
  • Whether students met the expected learning outcomes
  • Formal documentation of program standards
  • CSWE accreditation compliance
What placement quality measures
  • Whether the agency environment supported real development
  • How students rated supervision and learning opportunities
  • Whether students would recommend the site to a future student
03
What This Project Uses

12 years of evaluation data from one social work program.

The University of Montana School of Social Work has collected end-of-practicum evaluations from students each year. Before building the pipeline, the dataset needed to be reviewed, cleaned, and profiled so the thresholds and review signals reflected real patterns in the data rather than arbitrary cutoffs.

The cleaned dataset covers academic years 2014-2015 through 2025-2026. It includes both structured Likert-rated survey responses and open-ended student feedback. Each row represents one student evaluation of one agency in one academic year.

Dataset at a glance
  • Total evaluations: 2,258
  • Practicum agencies: 222
  • Agencies with sufficient data: 217
  • Academic years covered: 12 (2014–2026)
  • BSW evaluations: 856 (37.9%)
  • MSW evaluations: 1,402 (62.1%)
  • Mean Placement Quality Score: 3.83 (SD 0.46)
  • Score range: 2.49 to 4.93
Distribution of Placement Quality Scores across 217 agencies

Most agencies scored above 3.5. The lower end of the distribution had more review flags.

BSW and MSW mean Placement Quality Scores across six measures

MSW students rated placements higher on all six measures. Three of those differences were statistically significant using Welch’s t-test at p < 0.05.
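
For readers who want to reproduce this kind of comparison, here is a minimal Welch's t-test sketch using SciPy. The file and column names are illustrative, not the pipeline's actual field names.

# Minimal Welch's t-test sketch; file and column names are illustrative,
# not the pipeline's actual field names.
import pandas as pd
from scipy import stats

evals = pd.read_csv("evaluations_clean.csv")  # hypothetical cleaned export

bsw = evals.loc[evals["program_level"] == "BSW", "placement_quality_score"].dropna()
msw = evals.loc[evals["program_level"] == "MSW", "placement_quality_score"].dropna()

# equal_var=False selects Welch's t-test, which does not assume equal group variances
t_stat, p_value = stats.ttest_ind(msw, bsw, equal_var=False)
print(f"Welch's t = {t_stat:.2f}, p = {p_value:.3f}")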

04
How the Pipeline Works

One command. A repeatable, structured review process.

Running python run_all.py executes the full pipeline in order. The initial setup takes less than an hour. Annual updates take less than five minutes.

The pipeline is documented clearly enough that a staff member with no prior Python experience can replicate it using the public GitHub repository or run it interactively in Google Colab. No local installation is required for the Colab version.

Pipeline steps
1
Load and clean
Validate files, standardize agency names, and encode Likert items
2
Score & tag
Build the Placement Quality Score, run VADER sentiment analysis, and apply theme tags
3
Agency profiles
Create one row per agency with scores, review flags, themes, and trend results
4
Figures and tables
Generate static report figures, descriptive tables, and trend charts
5
Dashboard
Launch the Streamlit review interface for field education leadership
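
To show what the one-command structure can look like in practice, here is an illustrative orchestration sketch. The function bodies and file paths are placeholders, not the repository's actual code, but the step order mirrors the list above.

# Illustrative orchestration sketch; the step functions are placeholders that
# mirror the pipeline order above, not the repository's actual code.
from pathlib import Path
import pandas as pd

def load_and_clean(raw_dir: Path) -> pd.DataFrame:
    """Step 1: validate files, standardize agency names, encode Likert items (placeholder)."""
    frames = [pd.read_csv(path) for path in sorted(raw_dir.glob("*.csv"))]
    return pd.concat(frames, ignore_index=True)

def score_and_tag(evals: pd.DataFrame) -> pd.DataFrame:
    """Step 2: Placement Quality Score, VADER sentiment, theme tags (placeholder)."""
    return evals

def build_agency_profiles(evals: pd.DataFrame) -> pd.DataFrame:
    """Step 3: one row per agency with scores, flags, themes, and trends (placeholder)."""
    return evals.groupby("agency_name", as_index=False).mean(numeric_only=True)

def main() -> None:
    evals = load_and_clean(Path("data/raw"))          # assumed folder of yearly exports
    profiles = build_agency_profiles(score_and_tag(evals))
    profiles.to_csv("outputs/agency_profiles.csv", index=False)  # Step 4 figures/tables would follow
    print("Done. Step 5: streamlit run dashboard.py")

if __name__ == "__main__":
    main()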

A review flag appears when two or more signals point in the same direction at the same time.

Requiring two or more overlapping signals helps reduce false alarms and keeps the review focused where concern is more likely to reflect a real pattern. A single signal on its own could be noise, such as a small cohort or an unusual year. When multiple signals overlap, the workflow is identifying something worth a closer look.

64 agencies triggered the review flag

29.5% of the 217 agencies with sufficient data had two or more simultaneous concern signals active. All thresholds are stored in config.py and can be adjusted without modifying the pipeline logic.
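
For illustration, here is the kind of threshold block config.py could contain. The variable names are assumptions; the values mirror the thresholds described in this section.

# Illustrative config.py threshold block; variable names are assumptions,
# values mirror the thresholds described in this section.
PQS_REVIEW_THRESHOLD = 3.5        # Placement Quality Score floor
RECOMMEND_RATE_THRESHOLD = 0.70   # share of students who would recommend the site
SENTIMENT_THRESHOLD = 0.05        # VADER compound-score floor for open-ended comments
ADMIN_OVERLOAD_THRESHOLD = 0.40   # share of evaluations tagged administrative overload
DECLINE_POINTS = 0.30             # drop below an agency's own historical average
MIN_SIGNALS_FOR_FLAG = 2          # a review flag requires two or more simultaneous signals
MIN_EVALS_PER_YEAR = 3            # smaller cohorts are excluded from that year's analysis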

The five review signals

Placement Quality Score below 3.5

A measure built from Likert items covering supervision, learning goals, and practice readiness. This marks the lower portion of the distribution.

Recommendation rate below 70%

The share of students who said yes when asked whether they would recommend the agency to a future student.

Sentiment score below 0.05

VADER sentiment analysis applied to open-ended comment fields. This captures near-neutral or negative language in professional evaluative writing.

Administrative overload above 40%

More than 40% of an agency's evaluations carry the administrative-overload theme tag, which is applied when students describe paperwork, shadowing, or limited direct practice as the dominant experience.

Declining trend

An agency’s recent Placement Quality Score dropped more than 0.30 points below its own historical average.
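
Putting the signals together, here is a pandas sketch of how the five checks could be combined into a single review flag. The agency-level column names are illustrative, and the thresholds reference the config sketch above.

# Sketch of the multi-signal flag logic; agency-level column names are illustrative.
import pandas as pd
import config  # the threshold sketch above

def add_review_flags(agencies: pd.DataFrame) -> pd.DataFrame:
    signals = pd.DataFrame({
        "low_quality": agencies["placement_quality_score"] < config.PQS_REVIEW_THRESHOLD,
        "low_recommend": agencies["recommend_rate"] < config.RECOMMEND_RATE_THRESHOLD,
        "low_sentiment": agencies["mean_sentiment"] < config.SENTIMENT_THRESHOLD,
        "admin_overload": agencies["admin_overload_share"] > config.ADMIN_OVERLOAD_THRESHOLD,
        "declining_trend": (agencies["historical_mean_pqs"] - agencies["recent_pqs"])
                           > config.DECLINE_POINTS,
    })
    out = agencies.copy()
    out["active_signals"] = signals.sum(axis=1)  # how many concerns are active at once
    out["review_flag"] = out["active_signals"] >= config.MIN_SIGNALS_FOR_FLAG
    return out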

05
Key Findings

Placement quality patterns are there when you know how to look for them.

The analysis combined structured ratings, recommendation data, sentiment scores, and open-ended themes from 2,258 responses. Three findings shaped the workflow and the way the results are reviewed.
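
The sentiment scores come from VADER applied to the open-ended comment fields. A minimal example of how a single comment is scored follows; the comment text is invented for illustration.

# Minimal VADER example; the comment text is invented for illustration.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()
comment = ("My supervisor met with me weekly, but most of my hours went to "
           "paperwork rather than client contact.")
compound = analyzer.polarity_scores(comment)["compound"]
print(compound)  # ranges from -1 to 1; values below 0.05 count toward the sentiment signal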

Review flags cluster where both signals are low

Maroon triangles show flagged agencies. Gray dots show unflagged agencies. The shaded area indicates agencies where both the Placement Quality Score and the recommendation rate fall below the review thresholds.

Finding 1

Review flags were more common where both signals were weaker.

Agencies with lower Placement Quality Scores also tended to be agencies where fewer students would recommend the site to a future student. When both measures were weaker simultaneously, review flags were more likely to appear.

64
agencies flagged
29.5%
of the 217 agencies with sufficient data
Finding 2

The relationship between competency benchmarks and placement quality was essentially zero.

r = −0.009

p = 0.89 across 217 agencies

Knowing an agency’s competency score tells you almost nothing about how students rated the placement experience. Both measures matter, but they do not measure the same thing.
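
The correlation itself is a single SciPy call. Here is a sketch, assuming an agency-level output file with one competency score and one Placement Quality Score per agency; the file and column names are illustrative.

# Sketch of the benchmark-vs-quality correlation; file and column names are illustrative.
import pandas as pd
from scipy import stats

profiles = pd.read_csv("agency_profiles.csv")  # hypothetical agency-level output
r, p = stats.pearsonr(profiles["competency_score"], profiles["placement_quality_score"])
print(f"r = {r:.3f}, p = {p:.2f}")  # reported above as r = -0.009, p = 0.89 across 217 agencies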

Catholic Charities
Competency score 91.0 · Placement Quality Score 2.90
Peachtree Pediatric Therapy Center
Competency score 88.4 · Placement Quality Score 3.20
Both agencies met competency benchmarks while students reported weaker practicum experiences.
What stronger and weaker placements looked like in student language

Students described stronger placements in terms of supervision, learning, and real practice. They described weaker placements more often in terms of administrative overload, weaker structure, and limited client contact.

Program-wide Placement Quality Score, 12 years

2025-2026: first time below the 3.5 concern threshold in 12 years of data. The program had no way to know until this pipeline ran.

Finding 3

The year-to-year view revealed something the usual review process had not.

In 2025-2026, the program-wide Placement Quality Score fell to 3.41, the first time in 12 years of data that it dropped below the 3.5 concern threshold.

3.41
Program-wide Placement Quality Score in 2025-2026
First time below 3.5
in 12 years of data
68 additional agencies
showed a declining trend before reaching the full review-flag threshold. That added an earlier warning sign that did not exist in the prior review process.
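
For clarity, here is a sketch of how the declining-trend check can be computed per agency. How the recent and historical windows are defined here is an assumption, not the pipeline's exact rule.

# Sketch of the declining-trend check; the window definitions are assumptions,
# not the pipeline's exact rule.
import pandas as pd
import config

def declining_trend(agency_years: pd.DataFrame) -> bool:
    """agency_years: one agency's yearly mean Placement Quality Scores, ordered by year."""
    if len(agency_years) < 2:
        return False
    recent = agency_years["mean_pqs"].iloc[-1]              # most recent academic year
    historical = agency_years["mean_pqs"].iloc[:-1].mean()  # all prior years
    return (historical - recent) > config.DECLINE_POINTS    # more than 0.30 points below its own average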
06
See It in Action

This is what a field education team sees when opening a flagged agency profile.

Select any flagged agency from the dropdown to view the full profile produced by the workflow. Each field comes directly from the output, including scores, review signals, theme breakdown, trend direction, and representative student language.

[Interactive agency-profile preview: review flag status, declining-trend indicator, theme profile of least-helpful responses, most-helpful and least-helpful phrases, and the Placement Quality Score trend over time.]

This is a preview. The live dashboard shows all 222 agencies, with filtering, drill-down, and comparison. Open full dashboard
07
Interactive Dashboard

Explore the dashboard without leaving this page.

Six views, sidebar filters, agency profiles, trend monitoring, language patterns, and benchmark alignment are all embedded here for review.

For the fully interactive dashboard with the most current data from the latest workflow run, open the live Streamlit dashboard directly.

SSW Placement Quality Dashboard ↗
Six views in the dashboard: Overview, Flagged Agencies, Trend Monitoring, Language Patterns, Agency Profile, and Benchmark Alignment.
Updated after every pipeline run: run python run_all.py, and the dashboard reflects the latest evaluation data automatically.
FIELD EDUCATION REVIEW DASHBOARD
Open full page ↗

The embedded preview above shows selected views. For full functionality, including all 222 agencies and live filtering, open the SSW Placement Quality Dashboard ↗
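
To give a sense of how little code the review interface requires, here is a minimal Streamlit sketch of an agency-profile view. The data file and column names are illustrative, not the actual dashboard code.

# Minimal Streamlit sketch of an agency-profile view; file and column names are
# illustrative, not the actual dashboard code.
import pandas as pd
import streamlit as st

st.set_page_config(page_title="Field Education Review Dashboard", layout="wide")
profiles = pd.read_csv("agency_profiles.csv")  # hypothetical pipeline output

agency = st.sidebar.selectbox("Agency", sorted(profiles["agency_name"].unique()))
row = profiles.loc[profiles["agency_name"] == agency].iloc[0]

st.title(agency)
col1, col2, col3 = st.columns(3)
col1.metric("Placement Quality Score", f"{row['placement_quality_score']:.2f}")
col2.metric("Recommendation rate", f"{row['recommend_rate']:.0%}")
col3.metric("Active signals", int(row["active_signals"]))
if int(row["active_signals"]) >= 2:
    st.warning("Review recommended")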

08
Implications

This issue exists wherever programs collect evaluation data without a clear way to review it over time.

The pipeline was built using data from a single program. But the issue it addresses is not limited to one program, or even to social work.

Social work programs

897 accredited programs

CSWE requires programs to evaluate the effectiveness of field-based settings. Most collect this data every year. Very few have a structured way to use it for agency-level review.

Beyond social work

Any practicum, clinical, or internship program

Nursing, counseling, education, and public health all face the same problem. Evaluation data is collected each year, but patterns remain difficult to see without a monitoring process.

Placement platforms

Sonia Software, Experiential Learning Cloud, Intern Placement Tracking

These platforms house the agencies and manage placements, but they do not currently track placement quality over time. A monitoring pipeline like this one could be layered onto what they already host rather than run as a separate system.

Standalone or integrated

The pipeline runs as a standalone workflow for any program that exports evaluation data. It can also be integrated into existing placement management software as a monitoring feature, turning what programs are already required to collect into something they can actually act on.

09
For Your Program

Your program already collects this data but may not have a clear way to review it over time.

If your program collects end-of-practicum evaluations with Likert-rated items, open-ended student comments, and agency identifiers, this workflow can be adapted to your data format. The logic does not depend on UMSSW-specific fields. It depends on having structured ratings, narrative text, and a way to identify which agency each response came from.
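
As an example of what that mapping step looks like, here is an illustrative column-map block. The keys and survey-question labels are assumptions about how another program's export might be named, not required field names.

# Illustrative column-mapping block; the keys and survey-question labels are
# assumptions about how another program's export might be named.
COLUMN_MAP = {
    "agency_name": "Placement Site",
    "academic_year": "AY",                       # e.g. "2024-2025"
    "program_level": "Program",                  # BSW / MSW or your equivalent
    "recommend": "Would you recommend this site to a future student?",
    "comments_most_helpful": "What was most helpful about this placement?",
    "comments_least_helpful": "What was least helpful about this placement?",
    # Likert-rated items that feed the Placement Quality Score
    "likert_items": [
        "Quality of supervision",
        "Support for learning goals",
        "Preparation for practice",
    ],
}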

01

Map column names

One configuration file. One-time setup. No code changes needed.

One time only
02

Run on historical data

The first full pipeline run processes all historical evaluation files.

Under one hour
03

Review thresholds

Adjust the review thresholds to match your distribution. Stored in config.py.

Adjustable anytime
04

Annual updates

Add a new evaluation file and rerun the workflow. The dashboard refreshes automatically.

Under five minutes
Ready to adapt this for your program?
The first step is a conversation about your current evaluation instrument.
Get in touch
10
Limitations and Responsible Use

A review flag is a prompt for conversation, not a verdict.

This workflow supports review. It does not replace the professional knowledge, relationships, or judgment that field education leadership brings to agency decisions.

Small cohort sensitivity

Agencies with fewer than three evaluations in a given year are excluded from that year’s analysis. Scores based on very small cohorts can shift dramatically because of individual student fit rather than a broader agency pattern.
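
Here is a sketch of that exclusion rule in pandas, assuming the grouping columns shown; the actual pipeline columns may differ.

# Sketch of the small-cohort exclusion; grouping column names are illustrative.
import pandas as pd
import config

def drop_small_cohorts(evals: pd.DataFrame) -> pd.DataFrame:
    cohort_size = evals.groupby(["agency_name", "academic_year"])["agency_name"].transform("size")
    return evals[cohort_size >= config.MIN_EVALS_PER_YEAR]  # fewer than three evaluations in a year -> excluded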

Institutional context matters

Agencies that have gone through mergers, staffing transitions, or other organizational changes may show score declines that reflect disruption rather than a lasting change in placement quality. Review flags should be interpreted with that context in mind.

The pipeline cannot determine cause

A review flag means multiple signals were active at the same time. It does not explain why. The Field Education Team’s history with an agency, including site visits, supervisor relationships, and student feedback, is what turns a signal into a meaningful judgment.

Selection effects

Some agencies attract students who arrive more prepared because they offer pay, qualify as employment-based placements, or are easier to access. Strong competency scores at those sites may reflect student preparation as much as placement quality. The pipeline shows the pattern. It cannot determine the cause.

Not a ranking system

Agency profiles are not rankings, and they are not public. They are structured inputs for a leadership review conversation guided by the Field Education Team. The goal is to help programs ask more focused questions earlier, not to automate decisions that belong to people.

11
About This Project

Built inside the process it was designed to improve.

This project grew out of my role as a staff member in a field education program, specifically from watching evaluation data get collected, summarized, and filed away each year without a way to build on it.

The method used here, a multi-signal review pipeline applied to structured evaluation data, is not limited to social work. It can be adapted to any program that collects structured evaluations of practicum or internship sites.

The full code and documentation are publicly available. The pipeline is designed to be handed off. Any staff member with the evaluation export file and access to the repository can replicate or update the analysis.

Tools and methods
Python · VADER sentiment · Streamlit · pandas · Matplotlib · SciPy · Welch's t-test · Theme lexicon
CONTACT
Tomas Hernandez
MSBA Candidate · University of Montana
tomas.hernandez@umontana.edu