Title: | Joint Distribution of Number of Crossings and Longest Run |
---|---|
Description: | Joint distribution of the number of crossings and the longest run in a series of independent Bernoulli trials. The computation uses an iterative procedure in which results for a series are built from results for shorter series. The procedure conditions on the start value and partitions by further conditioning on the position of the first crossing (or none). |
Authors: | Tore Wentzel-Larsen [aut, cre], Jacob Anhøj [aut] |
Maintainer: | Tore Wentzel-Larsen <[email protected]> |
License: | GPL-3 |
Version: | 0.1.1 |
Built: | 2025-02-24 03:48:11 UTC |
Source: | https://github.com/torewentzel-larsen/crossrun |
A box cumulative sum is defined as the
cumulative sum over a lower left rectangle. This function
is primarily for use when the components are point
probabilities for the number of crossings C and the longest
run L; component (c,l) in the result is then the
probability $P(C \ge c, L \le l)$.
boxprobt(mtrx)
mtrx |
mpfr array |
mpfr array
nill <- Rmpfr::mpfr(0, 120)
one <- Rmpfr::mpfr(1, 120)
two <- Rmpfr::mpfr(2, 120)
contents <- c(one,nill,nill, one,one,one, two,two,two)
mtrx3 <- Rmpfr::mpfr2array(contents, dim = c(3, 3))
print(mtrx3)
print(boxprobt(mtrx3))
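The box cumulative sum can be illustrated on an ordinary numeric matrix without Rmpfr. This is a minimal sketch, assuming that "lower left rectangle" means all rows at or below row c and all columns at or to the left of column l; `box_cumsum` is a hypothetical helper written for illustration and is not the package's implementation.

```r
# Hypothetical base-R illustration of a box cumulative sum (not boxprobt itself).
box_cumsum <- function(m) {
  nr <- nrow(m)
  nc <- ncol(m)
  out <- matrix(0, nr, nc)
  for (i in seq_len(nr)) {
    for (j in seq_len(nc)) {
      # rows i..nr (at or below row i), columns 1..j (at or left of column j)
      out[i, j] <- sum(m[i:nr, 1:j])
    }
  }
  out
}

# Same values as in the example above, filled column by column
m3 <- matrix(c(1, 0, 0,  1, 1, 1,  2, 2, 2), nrow = 3)
print(m3)
print(box_cumsum(m3))
```

The top right component of the result is the sum of the whole matrix, and the bottom left component is the single entry in the last row and first column.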
Auxiliary function for simclbin, computing the number of crossings (type=0) or the longest run (type=1) in a sequence of independent normal observations. Crossings and runs are defined by whether the observations are above a shift.
clshift(seri, shift = 0, type = 0)
seri |
numeric; a sequence of random draws |
shift |
numeric; shift for the observations |
type |
numeric; 0 number of crossings, 1 longest run |
number of crossings or longest run, numeric
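Counting crossings and the longest run relative to a shift can be sketched with base R's `rle`. The helper `crossings_and_run` below is hypothetical and only illustrates the idea; it is not the package's clshift and returns both quantities at once instead of using a type argument.

```r
# Hypothetical illustration: crossings and longest run relative to a shift.
crossings_and_run <- function(seri, shift = 0) {
  above <- seri > shift            # classify each observation
  runs <- rle(above)$lengths       # lengths of consecutive equal classifications
  c(crossings = length(runs) - 1,  # each new run after the first is one crossing
    longest_run = max(runs))
}

x <- c(0.3, 1.2, -0.5, -0.1, -2.0, 0.7)
print(crossings_and_run(x))
```

Here the classification is above, above, below, below, below, above, giving runs of length 2, 3 and 1.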
Joint probability distribution for the number of crossings
C and the longest run L in a sequence of n autocorrelated Bernoulli
observations with success probability p. To enhance precision, results
are stored in mpfr arrays and the probabilities are multiplied by
$m^{n-1}$ for a multiplier m.
crossrunauto( nmax = 100, prob = 0.5, changeprob = 0.5, mult = 2, prec = 120, printn = FALSE )
nmax |
max sequence length. |
prob |
success probability p. |
changeprob |
unrestricted change probability. If |
mult |
multiplier for joint probabilities. |
prec |
mpfr precision. |
printn |
logical for progress output. |
list of joint probabilities.
# p=0.6, independence
cr10.6 <- crossrunbin(nmax=10, prob=0.6, printn=TRUE)
cra10.6 <- crossrunauto(nmax=10, prob=0.6, changeprob=.6, printn=TRUE)
Rmpfr::asNumeric(cr10.6$pt[[10]])
Rmpfr::asNumeric(cra10.6$pt[[10]])
Rmpfr::asNumeric(cr10.6$pt[[10]]) - Rmpfr::asNumeric(cra10.6$pt[[10]]) # equal
# p=0.6, some dependence
cr10.6 <- crossrunbin(nmax=10, prob=0.6, printn=TRUE)
cra10.6.u.5 <- crossrunauto(nmax=10, prob=0.6, changeprob=.5, printn=TRUE)
round(Rmpfr::asNumeric(cr10.6$pt[[10]]),1)
round(Rmpfr::asNumeric(cra10.6.u.5$pt[[10]]),1) # not the same
Joint probability distribution for the number of crossings
C and the longest run L in a sequence of n independent Bernoulli observations
with success probability p. To enhance precision, results are stored
in mpfr arrays and the probabilities are multiplied by
$m^{n-1}$ for a multiplier m.
crossrunbin(nmax = 100, prob = 0.5, mult = 2, prec = 120, printn = FALSE)
nmax |
max sequence length. |
prob |
success probability. |
mult |
multiplier for joint probabilities. |
prec |
mpfr precision. |
printn |
logical for progress output. |
list of joint probabilities.
crb10.6 <- crossrunbin(nmax=10, prob=.6, printn=TRUE)
print(crb10.6$pt[[10]])
Joint probability distribution for the number of crossings
C and the longest run L in a sequence of n independent Bernoulli observations
with possibly varying success probability. To enhance precision, results are stored
in mpfr arrays and the probabilities are multiplied by
$m^{n-1}$ for a multiplier m.
crossrunchange( nmax = 100, prob = rep(0.5, 100), mult = 2, prec = 120, printn = FALSE )
nmax |
max sequence length. |
prob |
success probabilities. |
mult |
multiplier for joint probabilities. |
prec |
mpfr precision. |
printn |
logical for progress output. |
list pt of joint probabilities. Cumulative probabilities qt within each row are also included. Further, mostly for code checking, lists pat and qat conditional on starting with a success, and pbt and qbt conditional on starting with a failure, are included.
prob10 <- c(rep(.5,5),rep(.7,5))
crchange10 <- crossrunchange(nmax=10, prob=prob10,printn=TRUE)
print(crchange10$pt[[10]])
Joint probability distribution for the number of crossings C and the longest run L in a sequence of n Bernoulli observations where the number of successes is fixed at m, with m between 0 and n. For fixed n, the joint distribution is computed for all m, which makes the computation demanding in terms of time and storage. The joint distribution is computed separately for sequences where the first observation is, or is not, a success. The results are mainly intended for use when n is even and m=n/2, but computation in this case requires that the distributions have already been computed for all m and for all shorter sequences (lower n). In the case of even n and m=n/2, the distributions for sequences starting or not starting with a success are identical, and only the distribution among sequences starting with a success is used. In that case, this may be interpreted as the joint distribution for sequences around the empirical median.
crossrunem(nmax = 100, prec = 120, printn = FALSE)
nmax |
max sequence length. |
prec |
mpfr precision. |
printn |
logical for progress output. |
nfi, number of sequences with m successes, starting with a success, and nfn, number of sequences with m successes, not starting with a success. Three-dimensional Rmpfr arrays for each n up to nmax, with dimensions n (C=0 to n-1), n (L=1 to n) and n+1 (m=0 to n). For n even and m=n/2, only nfi should be used, and only the part corresponding to C=1 to n-1 and L=1 to n/2 is non-zero.
crem14 <- crossrunem(nmax=14, printn=TRUE)
Rmpfr::asNumeric(crem14$nfi[[14]][,,"m=7"]) # subsets of size 7=14/2
# restricted to possible values of C and L
Rmpfr::asNumeric(crem14$nfi[[14]][-1,1:7,"m=7"]) # same as stored data joint14em
Rmpfr::asNumeric(crem14$nfn[[14]][-1,1:7,"m=7"]) # the same
# subsets of sizes different from 14/2
# size 4, first observation included
Rmpfr::asNumeric(crem14$nfi[[14]][,,"m=4"])
# size 14-4=10, first observation not included
Rmpfr::asNumeric(crem14$nfn[[14]][,,"m=10"]) # the same
Continuation of an existing sequence of the number of
crossings C and the longest run L in a sequence of n independent
continuous observations classified as above or below the empirical
median. To enhance precision, results are stored in mpfr arrays and
the probabilities are multiplied by $\binom{n}{m}/2$ where m=n/2,
even n assumed. The probabilities are integers in this representation.
crossrunemcont(emstart, n1 = 61, nmax = 100, prec = 120, printn = FALSE)
emstart |
existing sequence |
n1 |
sequence length for the first new case added |
nmax |
max sequence length. |
prec |
mpfr precision. |
printn |
logical for including progress output. |
nfi, number of sequences with m successes, starting with a success, and nfn, number of sequences with m successes, not starting with a success.
Wrapper for crossrunbin, with success probability pnorm(shift).
crossrunshift(nmax = 100, shift = 0, mult = 2, prec = 120, printn = FALSE)
nmax |
max sequence length. |
shift |
mean of normal distribution. |
mult |
multiplier for joint probabilities. |
prec |
mpfr precision. |
printn |
logical for progress output. |
list pt of joint probabilities. Cumulative probabilities qt within each row are also included. Further, mostly for code checking, lists pat and qat conditional on starting with a success, and pbt and qbt conditional on starting with a failure, are included.
crs15 <- crossrunshift(nmax=15,printn=TRUE)
print(crs15$pt[[15]])
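The relation between the shift and the success probability can be checked by simulation in base R. This sketch assumes that a success means a normal observation with mean shift and unit variance lies above 0, which gives probability pnorm(shift); the variable names are illustrative only.

```r
# Illustrative check: P(N(shift, 1) > 0) equals pnorm(shift).
set.seed(1)
shift <- 0.5
p <- pnorm(shift)                 # success probability implied by the shift
x <- rnorm(1e5, mean = shift)     # normal observations with mean equal to the shift
est <- mean(x > 0)                # empirical proportion of successes
print(c(theoretical = p, simulated = est))
```

With shift = 0 the success probability is 0.5, which corresponds to the symmetric case.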
Joint probability distribution for the number of crossings
C and the longest run L in a sequence of n independent Bernoulli observations
with success probability p. To enhance precision, results are stored
in mpfr arrays and the probabilities are multiplied by
$m^{n-1}$ for a multiplier m. This is for the symmetric case with success
probability 0.5, in which the multiplied probabilities are
integers for the default value 2 of the multiplier.
crossrunsymm(nmax = 100, mult = 2, prec = 120, printn = FALSE)
nmax |
max sequence length. |
mult |
multiplier for joint probabilities. Default 2. |
prec |
mpfr precision. |
printn |
logical for including progress output. |
pt, list of joint probabilities, multiplied by $m^{n-1}$.
Cumulative probabilities qt within each row are also included.
crs10 <- crossrunsymm(nmax=10,printn=TRUE)
Row-wise Cumulative Sums in mpfr Array.
cumsumm(mtrx)
mtrx |
mpfr two-dimensional array. |
mpfr array with row-wise cumulative sums, same dimension as the original array.
nill <- Rmpfr::mpfr(0, 120)
one <- Rmpfr::mpfr(1, 120)
two <- Rmpfr::mpfr(2, 120)
contents <- c(one,nill,nill, one,one,one, two,two,two)
mtrx3 <- Rmpfr::mpfr2array(contents, dim = c(3, 3))
print(mtrx3)
print(cumsumm(mtrx3))
Column-wise cumulative sums in mpfr array.
cumsummcol(mtrx)
mtrx |
mpfr two-dimensional array. |
mpfr array with column-wise cumulative sums, same dimension as the original array.
nill <- Rmpfr::mpfr(0, 120)
one <- Rmpfr::mpfr(1, 120)
two <- Rmpfr::mpfr(2, 120)
contents <- c(one,nill,nill, one,one,one, two,two,two)
mtrx3 <- Rmpfr::mpfr2array(contents, dim = c(3, 3))
print(mtrx3)
print(cumsummcol(mtrx3))
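For ordinary numeric matrices, row-wise and column-wise cumulative sums can be obtained with `apply` and `cumsum` in base R. The sketch below only illustrates what the two operations compute; it does not use mpfr arrays and is not the package's code.

```r
# Base-R illustration of row-wise and column-wise cumulative sums.
m3 <- matrix(c(1, 0, 0,  1, 1, 1,  2, 2, 2), nrow = 3)  # filled column by column

# apply over rows returns the results as columns, so transpose back
row_cumsum <- t(apply(m3, 1, cumsum))
# apply over columns already returns the original orientation
col_cumsum <- apply(m3, 2, cumsum)

print(row_cumsum)
print(col_cumsum)
```

Both results have the same dimensions as the input matrix.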
Exact joint probabilities, for low n,
of the number of crossings C and the longest run L
in n independent Bernoulli observations with success
probability p. Probabilities are multiplied by $2^{n-1}$.
exactbin(n, p = 0.5, prec = 120)
n |
number, length of sequence, at most 6. |
p |
success probability. |
prec |
precision in mpfr calculations. Default 120. |
mpfr array
exactbin(n=6)
exactbin(n=5, p=0.6)
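For small n the joint distribution can also be obtained by brute-force enumeration of all 2^n sequences, which is a useful independent check. The helper `joint_by_enumeration` below is hypothetical and is not the package's exactbin; the final multiplication by 2^(n-1) assumes that this is the multiplier used in the "times" representation.

```r
# Hypothetical brute-force check: enumerate all 0/1 sequences of length n.
joint_by_enumeration <- function(n, p = 0.5) {
  joint <- matrix(0, nrow = n, ncol = n,
                  dimnames = list(paste0("c=", 0:(n - 1)), paste0("l=", 1:n)))
  seqs <- as.matrix(expand.grid(rep(list(0:1), n)))  # one sequence per row
  for (i in seq_len(nrow(seqs))) {
    s <- seqs[i, ]
    runs <- rle(s)$lengths
    crossings <- length(runs) - 1   # number of changes between 0 and 1
    longest <- max(runs)            # length of the longest run
    k <- sum(s)                     # number of successes
    joint[crossings + 1, longest] <-
      joint[crossings + 1, longest] + p^k * (1 - p)^(n - k)
  }
  joint
}

j4 <- joint_by_enumeration(4)
print(j4 * 2^(4 - 1))  # assumed "times" representation for n = 4
```

In the symmetric case every sequence and its complement have the same C and L, so the counts are even and the multiplied probabilities are integers.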
The joint probabilities of the number C of crossings
(0, ..., 99) and the longest run L (1, ..., 100) in a
series of n=100 independent Bernoulli observations for
success probability 0.6. The probabilities are stored
in the "times" representation, multiplied by
$2^{n-1} = 2^{99}$. Only the joint distributions for
n=14, 60, 100 and success probabilities 0.5 and 0.6 are
included in the package to avoid excessive storage, but
many more cases may be generated by the function crossrunbin.
joint100.6
matrix, 100 rows and 100 columns
generated by the function crossrunbin and transformed from an Rmpfr array to a matrix
The joint probabilities of the number C of crossings
(0, ..., 99) and the longest run L (1, ..., 100) in a
series of n=100 independent Bernoulli observations for the
symmetric case (success probability 0.5). The probabilities
are stored in the "times" representation, multiplied by
$2^{n-1} = 2^{99}$, and are integers in the symmetric
case. Only the joint distributions for n=14, 60, 100
and success probabilities 0.5 and 0.6 are included in
the package to avoid excessive storage, but many more
cases may be generated by the function crossrunsymm.
joint100symm
matrix, 100 rows and 100 columns
generated by the function crossrunsymm and transformed from an Rmpfr array to a matrix
The joint probabilities of the number C of crossings
(0, ..., 13) and the longest run L (1, ..., 14) in a
series of n=14 independent Bernoulli observations for
success probability 0.6. The probabilities are stored
in the "times" representation, multiplied by
$2^{n-1} = 2^{13}$. Only the joint distributions for
n=14, 60, 100 and success probabilities 0.5 and 0.6 are
included in the package to avoid excessive storage, but
many more cases may be generated by the function crossrunbin.
joint14.6
matrix, 14 rows and 14 columns
generated by the function crossrunbin and transformed from an Rmpfr array to a matrix
Joint probabilities of the number C of crossings (1, ..., 13) and the longest run L (1, ..., 7) in a series of n=14 Bernoulli observations around its empirical median. The probabilities are stored in the "times" representation, multiplied by $\binom{14}{7}/2 = 1716$, the number of constellations starting above the median, and are integers. About the empirical median there is at least one crossing, and the longest run cannot exceed 14/2=7. Only the joint distributions for n=14, 60 are included in the package to avoid excessive storage, but many more cases may be generated by the function crossrunem. Since these computations are demanding in terms of storage and computation time, they are at present not performed for n much above 60.
joint14em
matrix, 13 rows and 7 columns
generated by the function crossrunem and transformed from an Rmpfr array to a matrix
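The multiplier for n=14 can be verified with base R's `choose`. The sketch counts the ways to place 7 of 14 observations above the median and halves that count, since by symmetry half of those constellations start above the median; the variable names are illustrative.

```r
# Number of constellations around the empirical median for n = 14.
n <- 14
m <- n / 2
total <- choose(n, m)          # ways to choose which m observations are above the median
starting_above <- total / 2    # by symmetry, half of them start above the median
print(c(total = total, starting_above = starting_above))
```

The value 1716 is the multiplier that turns the stored integers back into probabilities when divided out.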
Joint probabilities of the number C of crossings
(0, ... 13) and the longest run L (1, ..., 14) in a
series of n=14 independent Bernoulli observations for the
symmetric case (success probability 0.5). The probabilities
are stored in the "times" representation, multiplied by
$2^{n-1} = 2^{13}$, and are integers in the symmetric
case. Only the joint distributions for n=14, 60, 100
and success probabilities 0.5 and 0.6 are included in
the package to avoid excessive storage, but many more
cases may be generated by the function crossrunsymm.
joint14symm
matrix, 14 rows and 14 columns
generated by the function crossrunsymm and transformed from an Rmpfr array to a matrix
The joint probabilities of the number C of crossings
(0, ..., 59) and the longest run L (1, ..., 60) in a
series of n=60 independent Bernoulli observations for
success probability 0.6. The probabilities are stored
in the "times" representation, multiplied by
$2^{n-1} = 2^{59}$. Only the joint distributions for
n=14, 60, 100 and success probabilities 0.5 and 0.6 are
included in the package to avoid excessive storage, but
many more cases are generated in the script crossrun1.R.
joint60.6
matrix, 60 rows and 60 columns
generated by the function crossrunbin and transformed from an Rmpfr array to a matrix
Joint probabilities of the number C of crossings (1, ..., 59) and the longest run L (1, ..., 30) in a series of n=60 Bernoulli observations around its empirical median. The probabilities are stored in the "times" representation, multiplied by $\binom{60}{30}/2$, the number of constellations starting above the median, and are integers. About the empirical median there is at least one crossing, and the longest run cannot exceed 60/2=30. Only the joint distributions for n=14, 60 are included in the package to avoid excessive storage, but many more cases may be generated by the function crossrunem. Since these computations are demanding in terms of storage and computation time, they are at present not performed for n much above 60.
joint60em
matrix, 59 rows and 30 columns
generated by the function crossrunem and transformed from an Rmpfr array to a matrix
The joint probabilities of the number C of crossings
(0, ..., 59) and the longest run L (1, ..., 60) in a
series of n=60 independent Bernoulli observations for the
symmetric case (success probability 0.5). The probabilities
are stored in the "times" representation, multiplied by
$2^{n-1} = 2^{59}$, and are integers in the symmetric
case. Only the joint distributions for n=14, 60, 100
and success probabilities 0.5 and 0.6 are included in
the package to avoid excessive storage, but many more
cases may be generated by the function crossrunsymm.
joint60symm
matrix, 60 rows and 60 columns
generated by the function crossrunsymm and transformed from an Rmpfr array to a matrix
Simulation of a sequence of independent Bernoulli observations. To reduce the number of random draws, each simulation is based on one sequence of standard normal variables, and the Bernoulli outcomes are determined by whether each observation is above a shift defined by the assumed binomial probability.
simclbin(nser = 100, nsim = 1e+05, probs = c(0.5, 0.6, 0.7, 0.8, 0.9))
nser |
length of sequence simulated |
nsim |
number of simulations |
probs |
binomial probabilities |
a data frame with the number of crossings and longest run for each probability. For instance the variables nc0.5 and lr0.5 are the number of crossings and the longest run for success probability 0.5. One row for each simulation.
cl30simbin <- simclbin(nser=30, nsim=100)
mean(cl30simbin$nc0.5) # mean number of crossings, p=0.5
mean(cl30simbin$lr0.9) # mean longest run, p=0.9
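The idea of reusing one set of normal draws for several success probabilities can be sketched in base R. The helper `sim_cl_sketch` is hypothetical and is not the package's simclbin; in particular, the rule that a success means the draw exceeds qnorm(1 - p) is an assumption chosen so that the success probability equals p.

```r
# Hypothetical sketch: one set of normal draws reused for several probabilities.
sim_cl_sketch <- function(nser = 30, probs = c(0.5, 0.9)) {
  z <- rnorm(nser)  # standard normal draws, shared by all probabilities
  out <- data.frame(prob = probs, crossings = NA_integer_, longest = NA_integer_)
  for (i in seq_along(probs)) {
    # Assumed rule: success when z exceeds qnorm(1 - p), so that P(success) = p
    success <- z > qnorm(1 - probs[i])
    runs <- rle(success)$lengths
    out$crossings[i] <- length(runs) - 1L
    out$longest[i] <- max(runs)
  }
  out
}

set.seed(1)
print(sim_cl_sketch())
```

Because the draws are shared, a single simulated series yields one row per probability, at the cost of results for different probabilities being dependent.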
Simulation of a sequence of n=2m observations around the median in the sequence. To be used for checking the results of crossrunem.
simclem(m1 = 7, nsim = 1e+05)
m1 |
half the sequence length |
nsim |
number of simulations |
data frame with cs, number of crossings and ls, longest run in the simulations.
simclem14 <- simclem(nsim=sum(joint14em))
print(table(simclem14)) # joint distributions in the simulations
print(joint14em) # for comparison
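A single simulated series around its empirical median can be sketched in base R as follows. This is an illustration under the assumption that the observations are continuous, so that exactly half of them lie above the median; it is not the package's simclem, and the names cs and ls merely mirror the column names described above.

```r
# Illustrative single simulation around the empirical median, n = 2 * m1.
set.seed(1)
m1 <- 7
x <- rnorm(2 * m1)            # continuous observations, ties have probability zero
above <- x > median(x)        # exactly m1 observations lie above the median
runs <- rle(above)$lengths
result <- c(cs = length(runs) - 1,  # number of crossings
            ls = max(runs))         # longest run
print(result)
```

Since both sides of the median contain m1 observations, there is always at least one crossing and the longest run cannot exceed m1.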