
Estimates drug use periods from individual drug purchase data, supporting individuals with varied purchase patterns and stockpiling. The estimation uses package-specific parameters and Anatomical Therapeutic Chemical (ATC) Classification code-level parameters (the latter are provided with the package). Optionally, hospitalization data can be incorporated.

Usage

pre2dup(
  pre_data,
  pre_person_id,
  pre_atc,
  pre_package_id,
  pre_date,
  pre_ratio,
  pre_ddd,
  package_parameters,
  pack_atc,
  pack_id,
  pack_ddd_low,
  pack_ddd_usual,
  pack_dur_min,
  pack_dur_usual,
  pack_dur_max,
  atc_parameters,
  atc_class,
  atc_ddd_low,
  atc_ddd_usual,
  atc_dur_min,
  atc_dur_max,
  hosp_data = NULL,
  hosp_person_id = NULL,
  hosp_admission = NULL,
  hosp_discharge = NULL,
  date_range = NULL,
  global_gap_max = 300,
  global_min = 5,
  global_max = 300,
  global_max_single = 150,
  global_ddd_high = 10,
  global_hosp_max = 30,
  days_covered = 5,
  weight_past = 1,
  weight_current = 4,
  weight_next = 1,
  weight_first_last = 5,
  drop_atcs = FALSE,
  data_to_return = "periods",
  post_process_perc = 1
)

Arguments

pre_data

a data.frame or data.table containing drug purchases.

pre_person_id

character. Name of the column containing person id.

pre_atc

character. Name of the column containing ATC code.

pre_package_id

character. Name of the column containing package id.

pre_date

character. Name of the column containing purchase date.

pre_ratio

character. Name of the column containing the number of packages purchased, expressed as a ratio (partial packages are allowed).

pre_ddd

character. Name of the column containing defined daily doses (DDD) of the purchase.

package_parameters

a data.frame or data.table containing package parameters.

pack_atc

character. Name of the column containing ATC code.

pack_id

character. Name of the column containing package id.

pack_ddd_low

character. Name of the column containing lower limit of daily DDD.

pack_ddd_usual

character. Name of the column containing usual daily DDD.

pack_dur_min

character. Name of the column containing minimum duration of the package.

pack_dur_usual

character. Name of the column containing usual duration of the package.

pack_dur_max

character. Name of the column containing maximum duration of the package.

atc_parameters

a data.frame or data.table containing ATC parameters.

atc_class

character. Name of the column containing ATC class.

atc_ddd_low

character. Name of the column containing lower limit of daily DDD for the ATC class.

atc_ddd_usual

character. Name of the column containing usual daily DDD for the ATC class.

atc_dur_min

character. Name of the column containing minimum duration for the ATC class.

atc_dur_max

character. Name of the column containing maximum duration for the ATC class.

hosp_data

a data.frame or data.table containing hospitalizations.

hosp_person_id

character. Name of the column containing person id.

hosp_admission

character. Name of the column containing admission date.

hosp_discharge

character. Name of the column containing discharge date.

date_range

character. A vector of two dates: the expected start and end dates of the drug purchase data.

global_gap_max

numeric. Maximum time between purchases that can be considered as continuous use. Default is 300.

global_min

numeric. Minimum duration of a drug purchase. Default is 5.

global_max

numeric. Maximum duration of a drug purchase. Default is 300.

global_max_single

numeric. Maximum duration of a single purchase. Default is 150.

global_ddd_high

numeric. Maximum daily DDD of a purchase for any ATC. Default is 10.

global_hosp_max

numeric. Maximum number of hospital days to be considered when estimating the exposure duration. Default is 30.

days_covered

numeric. Maximum number of days to be added to the exposure duration to cover the gap between purchases. Default is 5.

weight_past

numeric. Weight for the past purchase in sliding average calculation. Default is 1.

weight_current

numeric. Weight for the current purchase in sliding average calculation. Default is 4.

weight_next

numeric. Weight for the next purchase in sliding average calculation. Default is 1.

weight_first_last

numeric. Weight for the first and last purchase in sliding average calculation. Default is 5.

drop_atcs

logical. If TRUE, ATC codes without sufficient DDD or package parameter coverage are ignored and the process continues with the rest of the ATC codes. If FALSE, function execution stops. Default is FALSE.

data_to_return

character. Defines the data to return: drug use periods, updated package parameters, or both. "periods" returns the drug use periods, "parameters" returns the updated package parameter file, and "both" returns both datasets. Default is "periods".

post_process_perc

numeric. Starting percentage for the gap duration to be used in post-processing. If the gap between consecutive drug use periods is at most the specified percentage of the duration of the preceding period, the periods will be connected. The percentage decreases by 0.1 in each iteration. Default is 1.
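To give a feel for how the weight_past, weight_current, weight_next and weight_first_last arguments above might interact, the following sketch computes a weighted sliding average over a sequence of per-purchase daily DDD values. The helper name sliding_average and the exact formula are assumptions made for illustration: a weighted mean over the previous, current and next purchase, with weight_first_last used for the current purchase at either end of the sequence. The package's internal calculation may differ.

```r
# Hypothetical helper, not part of the package: weighted sliding average
# over a vector of per-purchase daily DDD values.
sliding_average <- function(x, weight_past = 1, weight_current = 4,
                            weight_next = 1, weight_first_last = 5) {
  n <- length(x)
  vapply(seq_len(n), function(i) {
    # Assumption: the first and last purchase use weight_first_last,
    # since they lack one of their neighbours.
    w_cur <- if (i == 1 || i == n) weight_first_last else weight_current
    vals <- x[i]
    wts <- w_cur
    if (i > 1) {          # previous purchase, if any
      vals <- c(vals, x[i - 1])
      wts <- c(wts, weight_past)
    }
    if (i < n) {          # next purchase, if any
      vals <- c(vals, x[i + 1])
      wts <- c(wts, weight_next)
    }
    sum(vals * wts) / sum(wts)
  }, numeric(1))
}

# Illustrative daily DDD values for four consecutive purchases.
daily_ddd <- c(1.0, 0.8, 1.2, 1.0)
smoothed <- sliding_average(daily_ddd)
smoothed
```

With the default weights the current purchase dominates, so a single unusually large or small purchase is damped rather than ignored.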

Value

The dataset or datasets selected with data_to_return. If data_to_return = "periods", the function returns a data.table of drug use periods with period number, person identifier, ATC, period start and end dates, period duration in days, days spent in hospital during the period, number of purchases, total purchased DDDs and average daily DDD over the period. If data_to_return = "parameters", the function returns the original package parameter file with an additional column common_duration that contains the most common duration of each package in the drug purchase data. If data_to_return = "both", both datasets are returned as a list of two objects.

Details

The function validates the input data and arguments before assessing drug use periods. It stops execution if issues are detected, with the following exceptions:

  • Up to 10% of missing DDD values per ATC class in the drug purchase data is allowed.

  • Up to 10% of missing package parameter records per ATC class is allowed.

If either threshold is exceeded and drop_atcs = FALSE, the function stops with an error. With drop_atcs = TRUE, ATC classes with insufficient data are ignored and the function proceeds with the remaining data.
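The DDD threshold can be checked by hand before calling the function. The sketch below uses base R on a toy data frame; the column names ATC and amount and the object names are illustrative, and the package's own validation may be implemented differently.

```r
# Toy purchase data; column names and values are illustrative only.
purchases <- data.frame(
  ATC = c("N05AH02", "N05AH02", "N05AH02", "N05AH04", "N05AH04"),
  amount = c(33.33, NA, 33.33, 100, 100)
)

# Share of missing DDD values within each ATC class.
missing_share <- tapply(is.na(purchases$amount), purchases$ATC, mean)

# ATC classes exceeding the 10% threshold.
over_limit <- names(missing_share)[missing_share > 0.10]
over_limit

# Roughly what drop_atcs = TRUE implies: continue without those classes.
kept <- purchases[!purchases$ATC %in% over_limit, ]
kept
```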

There are five available methods for estimating the duration of each purchase, presented in the order of preference:

  • Continuous use: Based on purchased daily doses (DDDs), temporal average of daily DDDs, and individual purchase patterns.

  • Package-based methods:

    • Package DDD method: Based on purchased DDDs and the usual daily DDD for the specific package.

    • Package duration method: Based on the usual duration of the package and the number of packages (or the proportion of a partial package) purchased.

  • ATC-based methods:

    • ATC-level DDD method: Based on purchased DDDs and usual daily DDDs at the ATC level.

    • Minimum ATC duration method: Based on the minimum duration defined for the ATC group.
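The package-based and ATC-based methods above reduce to simple arithmetic. The sketch below is a plain reading of those descriptions with made-up numbers; it does not reproduce the package's internal logic, which may also apply the minimum and maximum limits given in the parameters.

```r
# Illustrative values only; none of these come from the package data.
purchased_ddd  <- 60    # DDDs in the purchase
n_packages     <- 2     # packages purchased
pack_ddd_usual <- 1.5   # usual daily DDD for this package
pack_dur_usual <- 30    # usual duration of one package, in days
atc_ddd_usual  <- 2     # usual daily DDD at the ATC level
atc_dur_min    <- 14    # minimum duration for the ATC group, in days

# Package DDD method: purchased DDDs divided by the package's usual daily DDD.
dur_package_ddd <- purchased_ddd / pack_ddd_usual

# Package duration method: usual package duration scaled by packages purchased.
dur_package_dur <- pack_dur_usual * n_packages

# ATC-level DDD method: purchased DDDs divided by the ATC-level usual daily DDD.
dur_atc_ddd <- purchased_ddd / atc_ddd_usual

# Minimum ATC duration method: fall back to the ATC group's minimum duration.
dur_atc_min <- atc_dur_min

c(package_ddd = dur_package_ddd, package_dur = dur_package_dur,
  atc_ddd = dur_atc_ddd, atc_min = dur_atc_min)
```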

Periods that are close in time are joined in a post-processing step controlled by post_process_perc. The post-processing percentage decreases by 0.1 at each estimation round to prevent long calculation times for large datasets.
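A single pass of the joining rule can be sketched as follows for the periods of one person and one ATC. The helper join_close_periods is hypothetical, and the way the gap and the preceding duration are counted in days is an assumption; the sketch does not iterate over decreasing percentages.

```r
# Hypothetical helper, not part of the package: one pass of joining
# consecutive periods (sorted by start date) of a single person and ATC.
join_close_periods <- function(start, end, perc = 1) {
  out_start <- start[1]
  out_end <- end[1]
  for (i in seq_along(start)[-1]) {
    k <- length(out_end)
    # Assumption: durations count both end points, gaps count the days between.
    prev_duration <- as.numeric(out_end[k] - out_start[k]) + 1
    gap <- as.numeric(start[i] - out_end[k]) - 1
    if (gap <= perc * prev_duration) {
      out_end[k] <- max(out_end[k], end[i])   # extend the preceding period
    } else {
      out_start <- c(out_start, start[i])     # keep as a separate period
      out_end <- c(out_end, end[i])
    }
  }
  data.frame(start = out_start, end = out_end)
}

start <- as.Date(c("2025-01-01", "2025-02-15", "2025-09-01"))
end   <- as.Date(c("2025-01-30", "2025-03-16", "2025-09-30"))

joined_100 <- join_close_periods(start, end, perc = 1)    # first two periods join
joined_040 <- join_close_periods(start, end, perc = 0.4)  # nothing joins
joined_100
```

Here the 15-day gap after the 30-day first period is joined at perc = 1 but not at perc = 0.4, which shows how a lower percentage makes joining stricter.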

With data_to_return = "parameters", pre2dup calculates the most common duration of each package from the drug purchase data. The usual package duration and usual daily DDD (total DDDs in the package / usual duration) in the package parameters can then be updated based on the common duration, and pre2dup can be re-run to calculate drug use periods with the updated package parameters.
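The update step might look like the sketch below. The data frame is a toy stand-in for the returned parameter file: only common_duration is named in this documentation, ddd_per_package is a hypothetical column holding the total DDDs in one package, the other column names follow the example call, and treating a missing common_duration as "no update" is an assumption.

```r
# Toy stand-in for the object returned with data_to_return = "parameters".
params <- data.frame(
  vnr = c(1001, 1002),
  ddd_per_package = c(30, 100),   # hypothetical column
  usual_dur = c(30, 100),
  usual_ddd = c(1, 1),
  common_duration = c(35, NA)
)

# Update only packages for which a common duration is available.
has_common <- !is.na(params$common_duration)
params$usual_dur[has_common] <- params$common_duration[has_common]

# Usual daily DDD = total DDDs in package / usual duration.
params$usual_ddd[has_common] <-
  params$ddd_per_package[has_common] / params$usual_dur[has_common]

params
```

The updated data frame would then be passed as package_parameters in a second pre2dup call.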

See also

Drug purchases, parameter files and hospitalizations each have their own check function. pre2dup runs the checks internally, but checking validity before running the function is recommended for faster and easier error detection and handling.

check_purchases, check_hospitalizations, check_package_parameters, check_atc_parameters

Examples

period_data <- pre2dup(
  pre_data = purchases_example, pre_person_id = "id", pre_atc = "ATC",
  pre_package_id = "vnr", pre_date = "purchase_date",
  pre_ratio = "n_packages", pre_ddd = "amount",
  package_parameters = package_parameters_example,
  pack_atc = "ATC", pack_id = "vnr", pack_ddd_low = "lower_ddd",
  pack_ddd_usual = "usual_ddd", pack_dur_min = "minimum_dur",
  pack_dur_usual = "usual_dur", pack_dur_max = "maximum_dur",
  atc_parameters = ATC_parameters, atc_class = "partial_atc",
  atc_ddd_low = "lower_ddd_atc", atc_ddd_usual = "usual_ddd_atc",
  atc_dur_min = "minimum_dur_atc", atc_dur_max = "maximum_dur_atc",
  hosp_data = hospitalizations_example, hosp_person_id = "id",
  hosp_admission = "hospital_start", hosp_discharge = "hospital_end",
  date_range = c("2025-01-01", "2025-12-31"),
  global_gap_max = 300, global_min = 5, global_max = 300,
  global_max_single = 150, global_ddd_high = 10, global_hosp_max = 30,
  days_covered = 5, weight_past = 1, weight_current = 4,
  weight_next = 1, weight_first_last = 5,
  drop_atcs = FALSE, data_to_return = "periods", post_process_perc = 1
)
#> Step 1/6: Checking parameters and datasets...
#> Checks passed for 'pre_data'
#> Checks passed for 'package_parameters'
#> Checks passed for 'atc_parameters'.
#> Checks passed for 'hosp_data'
#> Preparing hospitalization data and merging overlapping hospitalizations.
#> Step 2/6: Calculating purchase durations...
#> Step 3/6: Stockpiling assessment...
#> Step 4/6: Common package duration calculation was not selected in function call; skipping this step.
#> Step 5/6: Preparing drug use periods...
#> Step 6/6: Post-processing drug use periods...
#> Current post processing percentage: 1
#> Drug use periods calculated. 7 periods created for 5 persons.
#> Returning drug use periods.

period_data
#> Key: <period>
#>    period     id     ATC  dup_start    dup_end dup_days dup_hospital_days
#>     <int> <fctr>  <char>     <Date>     <Date>    <num>             <num>
#> 1:      1      1 N05AH02 2025-01-01 2025-04-14      104                 0
#> 2:      2      2 N05AH02 2025-01-15 2025-04-28      104                 5
#> 3:      3      3 N05AH02 2025-02-01 2025-05-15      104                 0
#> 4:      4      3 N05AH04 2025-01-05 2025-08-26      233                 0
#> 5:      5      4 N05AH02 2025-01-10 2025-04-23      104                 0
#> 6:      6      4 N05AH04 2025-01-20 2025-09-10      233                 0
#> 7:      7      5 N05AH04 2025-01-01 2025-08-22      233                38
#>    dup_n_purchases dup_last_purchase dup_total_DDD dup_temporal_average_DDDs
#>              <int>            <Date>         <num>                     <num>
#> 1:               3        2025-03-08         99.99                     0.961
#> 2:               3        2025-03-22         99.99                     0.961
#> 3:               3        2025-04-08         99.99                     0.961
#> 4:               2        2025-04-15        200.00                     0.858
#> 5:               3        2025-03-17         99.99                     0.961
#> 6:               2        2025-04-30        200.00                     0.858
#> 7:               2        2025-04-11        200.00                     0.858
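As a quick sanity check on the printed output, the last column appears consistent with dup_total_DDD divided by dup_days. The sketch below recomputes it for rows 1 and 4 using values copied from the output above; that this is how the package derives the column is an inference from these numbers, not a documented formula.

```r
# Values copied from rows 1 and 4 of the printed output.
dup_days      <- c(104, 233)
dup_total_DDD <- c(99.99, 200)

# Average daily DDD over the period, rounded as in the printout.
avg_ddd <- round(dup_total_DDD / dup_days, 3)
avg_ddd
```

Row 7 shows the same average as rows 4 and 6 despite 38 hospital days, which suggests that hospital days are not subtracted from the denominator in this example.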