rpact: Confirmatory Adaptive Clinical Trial Design and Analysis

rpact provides the function getSimulationSurvival for simulation of group-sequential trials with a time-to-event endpoint. For a given scenario, getSimulationSurvival simulates many hypothetical group-sequential trials and calculates the test results. Based on this Monte Carlo simulation, estimates of key quantities such as overall study power, stopping probabilities at each interim analysis, timing of analyses etc. can be obtained.

getSimulationSurvival complements the analytical calculations from function getSampleSizeSurvival in multiple ways:

  • Simulations can be used to assess the accuracy of the analytical formulas.
  • Simulations can answer questions such as the following:
    • How variable is the timing of the interim analysis (even if all assumptions are correct)?
    • What could the dataset of a trial that is stopped early for efficacy at an interim analysis look like?
  • Simulation is also possible for scenarios that are analytically intractable such as scenarios with delayed treatment effects.

The syntax of function getSimulationSurvival is very similar to function getSampleSizeSurvival. Hence, this document only provides some examples and expects that the reader is familiar with the R markdown document which describes standard designs of a trial with a survival endpoint.

getSimulationSurvival also supports adaptive sample size recalculation, but this is not covered here. For more details, see the help page ?getSimulationSurvival.

2 Standard analytical calculation

For comparison with the simulation-based analysis, a standard example is first calculated under the following assumptions:

  • Group-sequential design with one interim analysis after 66% of information using an O’Brien & Fleming type \(\alpha\)-spending function, one-sided type I error 2.5%, power 80%.
  • Exponential PFS with a median PFS of 60 months in control (lambda2=log(2)/60) and a target hazard ratio of 0.74 (hazardRatio=0.74).
  • Annual drop-out of 2.5% in both arms (dropoutRate1 = 0.025, dropoutRate2 = 0.025, dropoutTime = 12).
  • Recruitment ramps up linearly and reaches 42 patients/month from month 6 onwards (accrualTime = c(0,1,2,3,4,5,6), accrualIntensity = c(6,12,18,24,30,36,42)).
  • Randomization ratio 1:1 (allocation1 = 1 and allocation2 = 1; these give the number of consecutive subjects assigned to treatment groups 1 and 2, respectively). 1:1 allocation is the default and is thus not explicitly set in the function call below.
  • A fixed total sample size of 1200 (maxNumberOfSubjects = 1200).
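The assumptions above translate into a few derived quantities that reappear in the rpact output (hazard rates, the implied median in the treatment arm, event probabilities at 12 months, and the total accrual time). The following base R sketch recomputes them by hand; all variable names are illustrative and not part of rpact.

```r
# Planning assumptions taken from the list above.
medianControl <- 60      # median PFS in control (months)
hazardRatio   <- 0.74    # target hazard ratio
eventTime     <- 12      # time point (months) at which pi (1) and pi (2) are reported

# Exponential survival: hazard = log(2) / median.
lambda2 <- log(2) / medianControl          # control arm
lambda1 <- hazardRatio * lambda2           # treatment arm under proportional hazards
medianTreatment <- log(2) / lambda1        # implied median PFS in the treatment arm

# Probability of an event by month 12 in each arm.
pi2 <- 1 - exp(-lambda2 * eventTime)
pi1 <- 1 - exp(-lambda1 * eventTime)

# Accrual: one month at each ramp-up intensity, then 42/month until
# all 1200 subjects are enrolled.
recruitedInRampUp <- sum(c(6, 12, 18, 24, 30, 36))
totalAccrualTime  <- 6 + (1200 - recruitedInRampUp) / 42

round(c(lambda1 = lambda1, lambda2 = lambda2, pi1 = pi1, pi2 = pi2), 5)
round(c(medianTreatment = medianTreatment, totalAccrualTime = totalAccrualTime), 2)
```

The rounded results (median of about 81.1 months in the treatment arm and a total accrual time of about 31.57 months) agree with the corresponding rows of the rpact output.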

As described in the R markdown document which describes standard designs of a trial with a survival endpoint, sample size calculations for this design can be performed as per the code below:
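The original code chunk is not reproduced in this text. A call along the following lines, assembled from the arguments listed in the bullets above, should give output of the kind shown; the object names design and sampleSizeResult are chosen here for illustration, and the exact printout depends on the installed rpact version.

```r
library(rpact)

# One interim at 66% information, O'Brien & Fleming type alpha spending ("asOF"),
# one-sided alpha = 2.5%, power = 80% (beta = 0.2).
design <- getDesignGroupSequential(
    sided = 1, alpha = 0.025, beta = 0.2,
    informationRates = c(0.66, 1),
    typeOfDesign = "asOF"
)

# Analytical sample size / event calculation for the survival endpoint.
sampleSizeResult <- getSampleSizeSurvival(
    design,
    lambda2 = log(2) / 60, hazardRatio = 0.74,
    dropoutRate1 = 0.025, dropoutRate2 = 0.025, dropoutTime = 12,
    accrualTime = c(0, 1, 2, 3, 4, 5, 6),
    accrualIntensity = c(6, 12, 18, 24, 30, 36, 42),
    maxNumberOfSubjects = 1200
)
sampleSizeResult
```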

## Design plan parameters and output for survival data:
## 
## Design parameters:
##   Significance level                           : 0.0250 
##   Type II error rate                           : 0.2 
##   Test                                         : one-sided 
## 
## User defined parameters:
##   Lambda (2)                                   : 0.0116 
##   Hazard ratio                                 : 0.740 
##   Event time                                   : 12 
##   Accrual time                                 : 1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 31.57 
##   Accrual intensity                            : 6.0, 12.0, 18.0, 24.0, 30.0, 36.0, 42.0 
##   Drop-out rate (1)                            : 0.025 
##   Drop-out rate (2)                            : 0.025 
## 
## Default parameters:
##   Type of computation                          : Schoenfeld 
##   Theta H0                                     : 1 
##   Planned allocation ratio                     : 1 
##   Piecewise survival times                     : 0.00 
##   Drop-out time                                : 12.00 
## 
## Sample size and output:
##   Direction upper                              : FALSE 
##   pi (1)                                       : 0.0975 
##   pi (2)                                       : 0.129 
##   Median (1)                                   : 81.1 
##   Median (2)                                   : 60.0 
##   Lambda (1)                                   : 0.00855 
##   Maximum number of subjects                   : 1200.0 
##   Maximum number of subjects (1)               : 600.0 
##   Maximum number of subjects (2)               : 600.0 
##   Maximum number of events                     : 350.5 
##   Total accrual time                           : 31.57 
##   Follow up time                               : 22.08 
##   Calculate follow up time                     : TRUE 
##   Information rates [1]                        : 0.660 
##   Information rates [2]                        : 1.000 
##   Analysis times [1]                           : 39.54 
##   Analysis times [2]                           : 53.65 
##   Expected study duration under H1             : 47.91 
##   Maximal study duration                       : 53.65 
##   Number of events by stage [1]                : 231.3 
##   Number of events by stage [2]                : 350.5 
##   Expected number of events under H0           : 349.8 
##   Expected number of events under H0/H1        : 340.5 
##   Expected number of events under H1           : 302.0 
##   Number of subjects [1]                       : 1200.0 
##   Number of subjects [2]                       : 1200.0 
##   Expected number of subjects under H1         : 1200.0 
##   Reject per stage [1]                         : 0.407 
##   Reject per stage [2]                         : 0.393 
##   Early stop                                   : 0.407 
##   Critical values (effect scale) [1]           : 0.718 
##   Critical values (effect scale) [2]           : 0.808 
##   Local one-sided significance levels [1]      : 0.005798 
##   Local one-sided significance levels [2]      : 0.023210 
## 
## Legend:
##   (i): values of treatment arm i
##   [k]: values at stage k

By design, the power of the trial is 80%. The interim analysis takes place after 232 events (231.3 rounded up), which are expected to have occurred after 39.54 months, and the final analysis takes place after 351 events (350.5 rounded up), expected after 53.65 months. These numbers will now be compared to simulations.
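As a plausibility check on the event numbers, the fixed-design event number from Schoenfeld's formula can be computed in base R and compared with the group-sequential maximum of 350.5 reported above. The sketch below is a hand calculation, not the way rpact computes its result internally; the variable names are illustrative.

```r
alpha <- 0.025
beta <- 0.2
hazardRatio <- 0.74

# Schoenfeld's approximation for a single-analysis design with 1:1 allocation:
# d = 4 * (z_{1-alpha} + z_{1-beta})^2 / log(HR)^2
fixedDesignEvents <- 4 * (qnorm(1 - alpha) + qnorm(1 - beta))^2 / log(hazardRatio)^2

# Maximum number of events of the group-sequential design (from the output above)
# and the implied inflation relative to the fixed design.
maxEvents <- 350.5
inflation <- maxEvents / fixedDesignEvents

# Events at the interim (66% information) and final analysis, rounded up.
plannedEvents <- ceiling(c(0.66, 1) * maxEvents)

round(fixedDesignEvents, 1)   # about 346.3
round(inflation, 3)
plannedEvents                 # 232 351
```

The interim analysis costs only about 1% additional events compared with a fixed design, which is typical for O'Brien & Fleming type boundaries.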

3 Simulation under proportional hazards

The call getSimulationSurvival uses the same arguments as getSampleSizeSurvival with the following changes:

  • The maximum number of patients (maxNumberOfSubjects = 1200) is always provided to allow the simulation.
  • The number of events at each analysis is specified as per the analytical calculation above (plannedEvents=c(232,351)).
  • For one-sided tests, the direction of the alternative is specified. Here, the alternative is towards hazard ratios <1 which is specified as directionUpper = FALSE.
  • The number of simulated trials is specified (maxNumberOfIterations = 10000 in the example below).
  • By default, raw datasets from simulation runs are not extracted. However, in this example, it is specified that one raw dataset that led to stopping at each stage will be stored: maxNumberOfRawDatasetsPerStage=1.
  • For reproducibility, it is useful to set the random seed which is set to seed=2 in the example.
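
The original code chunk is not reproduced in this text. Putting the listed arguments together, the simulation call could look roughly as follows; the object name simulationResult is illustrative, and simulated numbers may differ slightly between rpact versions even with the same seed.

```r
library(rpact)

design <- getDesignGroupSequential(
    sided = 1, alpha = 0.025, beta = 0.2,
    informationRates = c(0.66, 1), typeOfDesign = "asOF"
)

simulationResult <- getSimulationSurvival(
    design,
    lambda2 = log(2) / 60, hazardRatio = 0.74,
    dropoutRate1 = 0.025, dropoutRate2 = 0.025, dropoutTime = 12,
    accrualTime = c(0, 1, 2, 3, 4, 5, 6),
    accrualIntensity = c(6, 12, 18, 24, 30, 36, 42),
    maxNumberOfSubjects = 1200,
    plannedEvents = c(232, 351),         # events at interim and final analysis
    directionUpper = FALSE,              # alternative: hazard ratio < 1
    maxNumberOfIterations = 10000,       # number of simulated trials
    maxNumberOfRawDatasetsPerStage = 1,  # store one raw dataset per stopping stage
    seed = 2
)
simulationResult
```

Depending on the rpact version, the "Simulated data" block of the printout may only appear when additional statistics are requested; see ?getSimulationSurvival for the options available in the installed version.
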
## Simulation of survival data (group sequential design):
## 
## User defined parameters:
##   Maximum number of subjects                   : 1200.0 
##   Accrual time                                 : 1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 31.57 
##   Accrual intensity                            : 6.0, 12.0, 18.0, 24.0, 30.0, 36.0, 42.0 
##   Planned events                               : 232, 351 
##   Direction upper                              : FALSE 
##   Drop-out rate (1)                            : 0.025 
##   Drop-out rate (2)                            : 0.025 
##   Maximum number of iterations                 : 10000 
##   Lambda (2)                                   : 0.0116 
##   Hazard ratio                                 : 0.740 
##   Seed                                         : 2 
## 
## Default parameters:
##   Drop-out time                                : 12.00 
##   Event time                                   : 12 
##   Theta H0                                     : 1 
##   Allocation 1                                 : 1 
##   Allocation 2                                 : 1 
##   Conditional power                            : NA 
##   Kappa                                        : 1 
## 
## Results:
##   pi (1)                                       : 0.0975 
##   pi (2)                                       : 0.129 
##   Median (1)                                   : 81.1 
##   Median (2)                                   : 60.0 
##   Lambda (1)                                   : 0.00855 
##   Iterations [1]                               : 10000 
##   Iterations [2]                               : 5984 
##   Analysis times [1]                           : 39.61 
##   Analysis times [2]                           : 53.68 
##   Expected study duration                      : 48.05 
##   Number of events by stage [1]                : 232.0 
##   Number of events by stage [2]                : 351.0 
##   Expected number of events                    : 303.2 
##   Events not achieved [1]                      : 0 
##   Events not achieved [2]                      : 0 
##   Number of subjects [1]                       : 1200.0 
##   Number of subjects [2]                       : 1200.0 
##   Expected number of subjects                  : 1200.0 
##   Reject per stage [1]                         : 0.402 
##   Reject per stage [2]                         : 0.393 
##   Overall reject                               : 0.794 
##   Futility stop per stage                      : 0 
##   Futility stop                                : 0 
##   Early stop                                   : 0.402 
##   Cond. power (achieved) [1]                   : NA 
##   Cond. power (achieved) [2]                   : 0.5413 
## 
## Simulated data:
##   Analysis times [1]                           : median [range]: 39.594 [34.791 - 45.392]; mean +/-sd: 39.612 +/-1.448 
##   Analysis times [2]                           : median [range]: 53.649 [45.574 - 62.554]; mean +/-sd: 53.68 +/-2.031 
##   Number of subjects [1]                       : median [range]: 1200 [1200 - 1200]; mean +/-sd: 1200 +/-0 
##   Number of subjects [2]                       : median [range]: 1200 [1200 - 1200]; mean +/-sd: 1200 +/-0 
##   Observed # events by stage (1) [1]           : median [range]: 101 [76 - 126]; mean +/-sd: 100.849 +/-6.707 
##   Observed # events by stage (1) [2]           : median [range]: 157 [135 - 184]; mean +/-sd: 157.697 +/-6.394 
##   Observed # events by stage (2) [1]           : median [range]: 131 [106 - 156]; mean +/-sd: 131.151 +/-6.707 
##   Observed # events by stage (2) [2]           : median [range]: 194 [167 - 216]; mean +/-sd: 193.303 +/-6.394 
##   Number of events by stage [1]                : median [range]: 232 [232 - 232]; mean +/-sd: 232 +/-0 
##   Number of events by stage [2]                : median [range]: 351 [351 - 351]; mean +/-sd: 351 +/-0 
##   Test statistic [1]                           : median [range]: 2.27 [-1.39 - 5.967]; mean +/-sd: 2.267 +/-0.992 
##   Test statistic [2]                           : median [range]: 2.31 [-1.002 - 4.98]; mean +/-sd: 2.277 +/-0.782 
##   Log-rank statistic [1]                       : median [range]: 2.27 [-1.39 - 5.967]; mean +/-sd: 2.267 +/-0.992 
##   Log-rank statistic [2]                       : median [range]: 2.31 [-1.002 - 4.98]; mean +/-sd: 2.277 +/-0.782 
##   Hazard ratio estimate LR [1]                 : median [range]: 0.742 [0.457 - 1.2]; mean +/-sd: 0.749 +/-0.098 
##   Hazard ratio estimate LR [2]                 : median [range]: 0.781 [0.588 - 1.113]; mean +/-sd: 0.787 +/-0.067 
##   Cond. power (achieved) [2]                   : median [range]: 0.605 [0 - 0.972]; mean +/-sd: 0.541 +/-0.338 
## 
## Legend:
##   (i): values of treatment arm i
##   [k]: values at stage k

According to the output, the simulated overall power is 79.4% and the probability of crossing the efficacy boundary at the interim analysis is 40.2%. Both are within 1 percentage point of the analytical values (80% and 40.7%, respectively).

The mean simulated analysis times are 39.61 months for the interim analysis and 53.68 months for the final analysis. Both differ by less than 0.1 months from the analytical calculation (differences of 0.07 and 0.03 months, respectively).
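Whether such differences are compatible with pure simulation noise can be judged from the Monte Carlo standard error of a simulated proportion. The following base R sketch uses the numbers quoted above and a normal-approximation interval; it is an illustrative check and not part of the rpact output.

```r
nSim <- 10000
simulatedPower  <- 0.794   # "Overall reject" in the simulation output
analyticalPower <- 0.80

# Binomial standard error of the simulated power and an approximate 95% interval.
mcSe <- sqrt(simulatedPower * (1 - simulatedPower) / nSim)
ci95 <- simulatedPower + c(-1, 1) * qnorm(0.975) * mcSe

round(mcSe, 4)
round(ci95, 3)

# Does the interval cover the analytical power?
ci95[1] <= analyticalPower && analyticalPower <= ci95[2]

# Differences between simulated and analytical analysis times (months).
round(c(39.61, 53.68) - c(39.54, 53.65), 2)
```

With 10000 iterations the standard error is about 0.4 percentage points, so the observed difference of 0.6 percentage points is well within what simulation noise alone can produce.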

The summary of the simulations above also provides the median [range] and mean +/- sd of the trial results across the simulated datasets. Summary results for each trial can be obtained from the simulation object using function getData. Similarly, raw data from individual trials that were stopped at each stage can be obtained using function getRawData (if maxNumberOfRawDatasetsPerStage was set > 0). The format of these datasets is described in the help ?getSimulationSurvival and illustrated below.

3.1 Accessing trial summaries per stage for each simulation

##   iterationNumber stageNumber        pi1       pi2 hazardRatio
## 1               1           1 0.09749927 0.1294494        0.74
## 2               2           1 0.09749927 0.1294494        0.74
## 3               3           1 0.09749927 0.1294494        0.74
## 4               4           1 0.09749927 0.1294494        0.74
## 5               4           2 0.09749927 0.1294494        0.74
## 6               5           1 0.09749927 0.1294494        0.74
##   analysisTime numberOfSubjects eventsPerStage1 eventsPerStage2
## 1     41.90615             1200              98             134
## 2     39.81463             1200              95             137
## 3     41.29950             1200              94             138
## 4     41.41023             1200             105             127
## 5     56.21735             1200             161             190
## 6     40.72236             1200              83             149
##   eventsPerStage rejectPerStage eventsNotAchieved futilityPerStage
## 1            232              1                 0                0
## 2            232              1                 0                0
## 3            232              1                 0                0
## 4            232              0                 0                0
## 5            351              0                 0                0
## 6            232              1                 0                0
##   testStatistic logRankStatistic conditionalPowerAchieved trialStop
## 1      2.657082         2.657082                       NA      TRUE
## 2      3.088842         3.088842                       NA      TRUE
## 3      3.331170         3.331170                       NA      TRUE
## 4      1.502588         1.502588                       NA     FALSE
## 5      1.663869         1.663869                0.4028956      TRUE
## 6      4.848153         4.848153                       NA      TRUE
##   hazardRatioEstimateLR
## 1             0.7054694
## 2             0.6665868
## 3             0.6457105
## 4             0.8209448
## 5             0.8372593
## 6             0.5290916
## 
##     1     2 
## 10000  5984
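
A listing of this kind can be obtained with getData, as mentioned in the text. The sketch below re-runs a smaller simulation (1000 instead of 10000 iterations, an arbitrary choice to keep it fast) so that it runs on its own; object names are illustrative.

```r
library(rpact)

design <- getDesignGroupSequential(
    sided = 1, alpha = 0.025, beta = 0.2,
    informationRates = c(0.66, 1), typeOfDesign = "asOF"
)

simulationResult <- getSimulationSurvival(
    design, lambda2 = log(2) / 60, hazardRatio = 0.74,
    dropoutRate1 = 0.025, dropoutRate2 = 0.025, dropoutTime = 12,
    accrualTime = c(0, 1, 2, 3, 4, 5, 6),
    accrualIntensity = c(6, 12, 18, 24, 30, 36, 42),
    maxNumberOfSubjects = 1200, plannedEvents = c(232, 351),
    directionUpper = FALSE, maxNumberOfIterations = 1000, seed = 2
)

# One row per simulated trial and stage that the trial reached.
aggregateSimulationData <- getData(simulationResult)
head(aggregateSimulationData, 6)

# Number of simulated trials that reached each stage.
table(aggregateSimulationData$stageNumber)
```

Every trial contributes a row for stage 1, while only trials that did not stop at the interim analysis contribute a row for stage 2.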

3.2 Accessing raw data from individual simulated trials

## Call:
## coxph(formula = Surv(timeUnderObservation, event) ~ I(treatmentGroup == 
##     1), data = stage1rawData)
## 
##   n= 1200, number of events= 232 
## 
##                               coef exp(coef) se(coef)      z Pr(>|z|)   
## I(treatmentGroup == 1)TRUE -0.3514    0.7037   0.1329 -2.643  0.00821 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##                            exp(coef) exp(-coef) lower .95 upper .95
## I(treatmentGroup == 1)TRUE    0.7037      1.421    0.5423    0.9131
## 
## Concordance= 0.548  (se = 0.017 )
## Likelihood ratio test= 7.08  on 1 df,   p=0.008
## Wald test            = 6.99  on 1 df,   p=0.008
## Score (logrank) test = 7.06  on 1 df,   p=0.008
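
A Cox regression like the one above can be fitted to a stored raw dataset obtained with getRawData. The sketch below is an assumption-laden reconstruction: the columns timeUnderObservation, event and treatmentGroup appear in the call shown above, whereas the column name stopStage used for subsetting is assumed here and should be checked with names(rawData). A smaller simulation is re-run so that the block runs on its own.

```r
library(rpact)
library(survival)

design <- getDesignGroupSequential(
    sided = 1, alpha = 0.025, beta = 0.2,
    informationRates = c(0.66, 1), typeOfDesign = "asOF"
)

simulationResult <- getSimulationSurvival(
    design, lambda2 = log(2) / 60, hazardRatio = 0.74,
    dropoutRate1 = 0.025, dropoutRate2 = 0.025, dropoutTime = 12,
    accrualTime = c(0, 1, 2, 3, 4, 5, 6),
    accrualIntensity = c(6, 12, 18, 24, 30, 36, 42),
    maxNumberOfSubjects = 1200, plannedEvents = c(232, 351),
    directionUpper = FALSE, maxNumberOfIterations = 1000,
    maxNumberOfRawDatasetsPerStage = 1,   # required for getRawData
    seed = 2
)

rawData <- getRawData(simulationResult)

# Keep the stored trial that stopped at the interim analysis.
# Assumption: the stopping stage is stored in a column named stopStage.
stage1rawData <- rawData[rawData$stopStage == 1, ]

# Cox model for the treatment effect (group 1 versus group 2).
summary(coxph(Surv(timeUnderObservation, event) ~ I(treatmentGroup == 1),
    data = stage1rawData))
```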

4 Simulation under non-proportional hazards

For the sake of illustration, assume that the treatment effect in the example above is delayed by 6 months, i.e., that the active treatment does not affect the hazard in the first 6 months but still multiplies it by 0.74 from month 6 onwards. The code below determines the power of the design in this situation via simulation.

The code to specify a delayed treatment effect is similar to the simulation under proportional hazards except that now the survival function in each arm is specified via a piecewise constant exponential distribution: piecewiseSurvivalTime=c(0,6), lambda2=c(log(2)/60,log(2)/60), lambda1=c(log(2)/60,0.74*log(2)/60) (as always for rpact, “2” refers to the control arm which still has a constant hazard rate, i.e. an exponential distribution).
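
The original code chunk is not reproduced in this text. Based on the arguments named in the paragraph above, the call could be sketched as follows; the object names and the helper variable lambdaControl are illustrative, and results may vary slightly across rpact versions.

```r
library(rpact)

design <- getDesignGroupSequential(
    sided = 1, alpha = 0.025, beta = 0.2,
    informationRates = c(0.66, 1), typeOfDesign = "asOF"
)

lambdaControl <- log(2) / 60   # constant control hazard (median 60 months)

simulationResultNPH <- getSimulationSurvival(
    design,
    piecewiseSurvivalTime = c(0, 6),                   # intervals [0, 6) and [6, Inf)
    lambda2 = c(lambdaControl, lambdaControl),         # control: no change at month 6
    lambda1 = c(lambdaControl, 0.74 * lambdaControl),  # treatment: effect starts at month 6
    dropoutRate1 = 0.025, dropoutRate2 = 0.025, dropoutTime = 12,
    accrualTime = c(0, 1, 2, 3, 4, 5, 6),
    accrualIntensity = c(6, 12, 18, 24, 30, 36, 42),
    maxNumberOfSubjects = 1200,
    plannedEvents = c(232, 351),
    directionUpper = FALSE,
    maxNumberOfIterations = 10000,
    seed = 3
)
simulationResultNPH
```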

## Simulation of survival data (group sequential design):
## 
## User defined parameters:
##   Maximum number of subjects                   : 1200.0 
##   Accrual time                                 : 1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 31.57 
##   Accrual intensity                            : 6.0, 12.0, 18.0, 24.0, 30.0, 36.0, 42.0 
##   Planned events                               : 232, 351 
##   Direction upper                              : FALSE 
##   Drop-out rate (1)                            : 0.025 
##   Drop-out rate (2)                            : 0.025 
##   Maximum number of iterations                 : 10000 
##   Piecewise survival times                     : 0.00, 6.00 
##   Lambda (1)                                   : 0.01155, 0.00855 
##   Lambda (2)                                   : 0.0116, 0.0116 
##   Hazard ratio                                 : 1.000, 0.740 
##   Seed                                         : 3 
## 
## Default parameters:
##   Drop-out time                                : 12.00 
##   Event time                                   : 12 
##   Theta H0                                     : 1 
##   Allocation 1                                 : 1 
##   Allocation 2                                 : 1 
##   Conditional power                            : NA 
##   Kappa                                        : 1 
## 
## Results:
##   Iterations [1]                               : 10000 
##   Iterations [2]                               : 8471 
##   Analysis times [1]                           : 38.65 
##   Analysis times [2]                           : 52.68 
##   Expected study duration                      : 50.54 
##   Number of events by stage [1]                : 232.0 
##   Number of events by stage [2]                : 351.0 
##   Expected number of events                    : 332.8 
##   Events not achieved [1]                      : 0 
##   Events not achieved [2]                      : 0 
##   Number of subjects [1]                       : 1200.0 
##   Number of subjects [2]                       : 1200.0 
##   Expected number of subjects                  : 1200.0 
##   Reject per stage [1]                         : 0.153 
##   Reject per stage [2]                         : 0.419 
##   Overall reject                               : 0.572 
##   Futility stop per stage                      : 0 
##   Futility stop                                : 0 
##   Early stop                                   : 0.153 
##   Cond. power (achieved) [1]                   : NA 
##   Cond. power (achieved) [2]                   : 0.3587 
## 
## Simulated data:
##   Analysis times [1]                           : median [range]: 38.612 [34.035 - 44.311]; mean +/-sd: 38.654 +/-1.462 
##   Analysis times [2]                           : median [range]: 52.603 [45.908 - 60.955]; mean +/-sd: 52.676 +/-1.99 
##   Number of subjects [1]                       : median [range]: 1200 [1200 - 1200]; mean +/-sd: 1200 +/-0 
##   Number of subjects [2]                       : median [range]: 1200 [1200 - 1200]; mean +/-sd: 1200 +/-0 
##   Observed # events by stage (1) [1]           : median [range]: 106 [79 - 131]; mean +/-sd: 105.776 +/-6.709 
##   Observed # events by stage (1) [2]           : median [range]: 160 [137 - 187]; mean +/-sd: 159.808 +/-6.987 
##   Observed # events by stage (2) [1]           : median [range]: 126 [101 - 153]; mean +/-sd: 126.224 +/-6.709 
##   Observed # events by stage (2) [2]           : median [range]: 191 [164 - 214]; mean +/-sd: 191.192 +/-6.988 
##   Number of events by stage [1]                : median [range]: 232 [232 - 232]; mean +/-sd: 232 +/-0 
##   Number of events by stage [2]                : median [range]: 351 [351 - 352]; mean +/-sd: 351 +/-0.015 
##   Test statistic [1]                           : median [range]: 1.494 [-2.302 - 5.356]; mean +/-sd: 1.492 +/-0.995 
##   Test statistic [2]                           : median [range]: 1.981 [-1.53 - 4.684]; mean +/-sd: 1.948 +/-0.873 
##   Log-rank statistic [1]                       : median [range]: 1.494 [-2.302 - 5.356]; mean +/-sd: 1.492 +/-0.995 
##   Log-rank statistic [2]                       : median [range]: 1.981 [-1.53 - 4.684]; mean +/-sd: 1.948 +/-0.873 
##   Hazard ratio estimate LR [1]                 : median [range]: 0.822 [0.495 - 1.353]; mean +/-sd: 0.829 +/-0.109 
##   Hazard ratio estimate LR [2]                 : median [range]: 0.809 [0.607 - 1.177]; mean +/-sd: 0.816 +/-0.077 
##   Cond. power (achieved) [2]                   : median [range]: 0.256 [0 - 0.972]; mean +/-sd: 0.359 +/-0.34 
## 
## Legend:
##   (i): values of treatment arm i
##   [k]: values at stage k

As can be seen from the output above, power drops substantially to 57.2%.

To maintain a power of 80% despite the delayed treatment effect, the maximum number of events would need to be increased. Some experimentation shows that 498 events are needed in this case, as demonstrated in the simulation below. Note further that in this non-proportional hazards scenario, power depends not only on the number of events but also on other assumptions, in particular on assumptions regarding the speed of recruitment.
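
The experimentation mentioned above can be organised as a small grid search over candidate event numbers. The sketch below is one possible way to do this and not necessarily how the value 498 was found; simulatePowerDelayed is a hypothetical helper written for this illustration (not an rpact function), and the candidate grid and the reduced number of iterations are arbitrary choices.

```r
library(rpact)

design <- getDesignGroupSequential(
    sided = 1, alpha = 0.025, beta = 0.2,
    informationRates = c(0.66, 1), typeOfDesign = "asOF"
)

# Hypothetical helper: simulated power under the 6-month delayed effect
# for a given maximum number of events (interim at 66% of the events).
simulatePowerDelayed <- function(maxEvents, nSim = 1000, seed = 4) {
    lambdaControl <- log(2) / 60
    result <- getSimulationSurvival(
        design,
        piecewiseSurvivalTime = c(0, 6),
        lambda2 = c(lambdaControl, lambdaControl),
        lambda1 = c(lambdaControl, 0.74 * lambdaControl),
        dropoutRate1 = 0.025, dropoutRate2 = 0.025, dropoutTime = 12,
        accrualTime = c(0, 1, 2, 3, 4, 5, 6),
        accrualIntensity = c(6, 12, 18, 24, 30, 36, 42),
        maxNumberOfSubjects = 1200,
        plannedEvents = ceiling(c(0.66, 1) * maxEvents),
        directionUpper = FALSE,
        maxNumberOfIterations = nSim, seed = seed
    )
    result$overallReject
}

candidateEvents <- c(351, 425, 498)
power <- sapply(candidateEvents, simulatePowerDelayed)
data.frame(maxEvents = candidateEvents, power = round(power, 3))
```

With only 1000 iterations per candidate, each power estimate has a standard error of roughly 1.5 percentage points, so the final choice should be confirmed with a larger run such as the one with 10000 iterations shown below.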

## Simulation of survival data (group sequential design):
## 
## User defined parameters:
##   Maximum number of subjects                   : 1200.0 
##   Accrual time                                 : 1.00, 2.00, 3.00, 4.00, 5.00, 6.00, 31.57 
##   Accrual intensity                            : 6.0, 12.0, 18.0, 24.0, 30.0, 36.0, 42.0 
##   Planned events                               : 329, 498 
##   Direction upper                              : FALSE 
##   Drop-out rate (1)                            : 0.025 
##   Drop-out rate (2)                            : 0.025 
##   Maximum number of iterations                 : 10000 
##   Piecewise survival times                     : 0.00, 6.00 
##   Lambda (1)                                   : 0.01155, 0.00855 
##   Lambda (2)                                   : 0.0116, 0.0116 
##   Hazard ratio                                 : 1.000, 0.740 
##   Seed                                         : 4 
## 
## Default parameters:
##   Drop-out time                                : 12.00 
##   Event time                                   : 12 
##   Theta H0                                     : 1 
##   Allocation 1                                 : 1 
##   Allocation 2                                 : 1 
##   Conditional power                            : NA 
##   Kappa                                        : 1 
## 
## Results:
##   Iterations [1]                               : 10000 
##   Iterations [2]                               : 6765 
##   Analysis times [1]                           : 49.93 
##   Analysis times [2]                           : 74.20 
##   Expected study duration                      : 66.38 
##   Number of events by stage [1]                : 329.0 
##   Number of events by stage [2]                : 498.0 
##   Expected number of events                    : 443.3 
##   Events not achieved [1]                      : 0 
##   Events not achieved [2]                      : 0 
##   Number of subjects [1]                       : 1200.0 
##   Number of subjects [2]                       : 1200.0 
##   Expected number of subjects                  : 1200.0 
##   Reject per stage [1]                         : 0.324 
##   Reject per stage [2]                         : 0.478 
##   Overall reject                               : 0.801 
##   Futility stop per stage                      : 0 
##   Futility stop                                : 0 
##   Early stop                                   : 0.324 
##   Cond. power (achieved) [1]                   : NA 
##   Cond. power (achieved) [2]                   : 0.5006 
## 
## Simulated data:
##   Analysis times [1]                           : median [range]: 49.899 [43.199 - 56.841]; mean +/-sd: 49.93 +/-1.916 
##   Analysis times [2]                           : median [range]: 74.128 [63.222 - 85.232]; mean +/-sd: 74.198 +/-2.868 
##   Number of subjects [1]                       : median [range]: 1200 [1200 - 1200]; mean +/-sd: 1200 +/-0 
##   Number of subjects [2]                       : median [range]: 1200 [1200 - 1200]; mean +/-sd: 1200 +/-0 
##   Observed # events by stage (1) [1]           : median [range]: 148 [115 - 178]; mean +/-sd: 148.303 +/-7.641 
##   Observed # events by stage (1) [2]           : median [range]: 228 [203 - 260]; mean +/-sd: 228.078 +/-7.115 
##   Observed # events by stage (2) [1]           : median [range]: 181 [151 - 214]; mean +/-sd: 180.697 +/-7.641 
##   Observed # events by stage (2) [2]           : median [range]: 270 [238 - 295]; mean +/-sd: 269.922 +/-7.115 
##   Number of events by stage [1]                : median [range]: 329 [329 - 329]; mean +/-sd: 329 +/-0 
##   Number of events by stage [2]                : median [range]: 498 [498 - 498]; mean +/-sd: 498 +/-0 
##   Test statistic [1]                           : median [range]: 2.079 [-1.502 - 5.956]; mean +/-sd: 2.075 +/-0.995 
##   Test statistic [2]                           : median [range]: 2.431 [-0.877 - 4.957]; mean +/-sd: 2.396 +/-0.793 
##   Log-rank statistic [1]                       : median [range]: 2.079 [-1.502 - 5.956]; mean +/-sd: 2.075 +/-0.995 
##   Log-rank statistic [2]                       : median [range]: 2.431 [-0.877 - 4.957]; mean +/-sd: 2.396 +/-0.793 
##   Hazard ratio estimate LR [1]                 : median [range]: 0.795 [0.519 - 1.18]; mean +/-sd: 0.8 +/-0.088 
##   Hazard ratio estimate LR [2]                 : median [range]: 0.804 [0.641 - 1.082]; mean +/-sd: 0.809 +/-0.058 
##   Cond. power (achieved) [2]                   : median [range]: 0.537 [0 - 0.972]; mean +/-sd: 0.501 +/-0.347 
## 
## Legend:
##   (i): values of treatment arm i
##   [k]: values at stage k

System: rpact 2.0.1, R version 3.5.2 (2018-12-20), platform: x86_64-w64-mingw32

To cite package ‘rpact’ in publications use:

Gernot Wassmer and Friedrich Pahlke (2019). rpact: Confirmatory Adaptive Clinical Trial Design and Analysis. R package version 2.0.1. https://CRAN.R-project.org/package=rpact

 

This work by Marcel Wolbers, Gernot Wassmer and Friedrich Pahlke is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.