Issue 12196: Section: 8.3.3.1 (marte-ftf)
Source: THALES (Mr. Eric Maes, eric.maes(at)fr.thalesgroup.com)
Nature: Enhancement
Severity: Minor
Summary: Proposal to extend the list of probability distribution functions with the following (used in AnyLogic):
- geometric (double p) The Geometric distribution is a discrete distribution bounded at 0 and unbounded on the high side. It is a special case of the Negative Binomial distribution. In particular, it is the direct discrete analog for the continuous Exponential distribution. The Geometric distribution has no history dependence, its probability at any value being independent of a shift along the axis.
- laplace (double phi, double beta) The Laplace distribution, sometimes called the double exponential distribution, is an unbounded continuous distribution that has a very sharp central peak, located at theta. The distribution scales with phi.
- chi squared (double nu, double min) The Chi Squared is a continuous distribution bounded on the lower side. Note that the Chi Squared distribution is a subset of the Gamma distribution with beta=2 and alpha=nu/2. Like the Gamma distribution, it has three distinct regions. For nu=2, the Chi Squared distribution reduces to the Exponential distribution, starting at a finite value at minimum x and decreasing monotonically thereafter. For nu<2, the Chi Squared distribution tends to infinity at minimum x and decreases monotonically for increasing x. For nu>2, the Chi Squared distribution is 0 at minimum x, peaks at a value that depends on nu, decreasing monotonically thereafter.
- rayleigh (double sigma) The Rayleigh distribution is a continuous distribution bounded on the lower side. It is a special case of the Weibull distribution with alpha=2 and beta/sqrt(2)=sigma. Because of the fixed shape parameter, the Rayleigh distribution does not change shape although it can be scaled.
- weibull (double alpha, double beta, double min) The Weibull distribution is a continuous distribution bounded on the lower side. Because it provides one of the limiting distributions for extreme values, it is also referred to as the Frechet distribution and the Weibull-Gnedenko distribution.
- logistic (double beta, double alpha) The Logistic distribution is an unbounded continuous distribution which is symmetrical about its mean [and shift parameter], alpha. The shape of the Logistic distribution is very much like the Normal distribution, except that the Logistic distribution has broader tails.
- pareto (double alpha, double min) The Pareto distribution is a continuous distribution bounded on the lower side. It has a finite value at the minimum x and decreases monotonically for increasing x. A Pareto random variable is the exponential of an Exponential random variable, and possesses many of the same characteristics.
- triangular (double min, double max, double mode) The Triangular distribution is often used when no or little data is available; it is rarely an accurate representation of a data set. However, it is employed as the functional form of regions for fuzzy logic due to its ease of use.
- cauchy (double lambda, double theta) The Cauchy distribution is an unbounded continuous distribution that has a sharp central peak but significantly broad tails. The tails are much heavier than the tails of the Normal distribution.
- beta (double p, double q, double min, double max) The Beta distribution is a continuous distribution that has both upper and lower finite bounds. Because many real situations can be bounded in this way, the Beta distribution can be used empirically to estimate the actual distribution before much data is available. Even when data is available, the Beta distribution should fit most data in a reasonable fashion, although it may not be the best fit. The Uniform distribution is a special case of the Beta distribution with p, q = 1.
- lognormal (double mu, double sigma, double min) The Lognormal distribution is a continuous distribution bounded on the lower side. It is always 0 at minimum x, rising to a peak that depends on both mu and sigma, then decreasing monotonically for increasing x.
- erlang (double beta, int m, double min) The Erlang distribution is a continuous distribution bounded on the lower side. It is a special case of the Gamma distribution where the parameter, m, is restricted to a positive integer. As such, the Erlang distribution has no region where F(x) tends to infinity at the minimum value of x [m<1], but does have a special case at m=1, where it reduces to the Exponential distribution.
- negativeBinomial (double p, double n) The Negative Binomial distribution is a discrete distribution bounded on the low side at 0 and unbounded on the high side. The Negative Binomial distribution reduces to the Geometric Distribution for k = 1. The Negative Binomial distribution gives the total number of trials, x, to get k events (failures...), each with the constant probability, p, of occurring.
- logarithmic (double beta) The Logarithmic distribution is a discrete distribution bounded by [1,...]. Theta is related to the sample size and the mean.
- hypergeometric (int ss, int dn, int ps) The Hypergeometric distribution is a discrete distribution bounded by [0,s]. It describes the number of defects, x, in a sample of size s from a population of size N which has m total defects. The ratio of m/N = p is sometimes used rather than m to describe the probability of a defect. Note that defects may be interpreted as successes, in which case x is the number of failures until (s-x) successes. The sample is taken without replacement.
Resolution: We need to be sure that the new distribution functions are really necessary in MARTE. Note that we do not attempt in MARTE to define all existing distribution functions, only those required in common practice.
After evaluating some performance analysis and simulation tools, we consider that the following ones are highly desired, and propose to include them in the MARTE library.
- geometric (real p). The Geometric distribution is a discrete distribution bounded at 0 and unbounded on the high side.
- triangular (real min, real max, real mode). The Triangular distribution is often used when no or little data is available; it is rarely an accurate representation of a data set.
- logarithmic (real theta). The Logarithmic distribution is a discrete distribution bounded by [1,...]. Theta is related to the sample size and the mean.
Further distribution functions can be added at library level.
Issue Dependency Warning: Note that this issue affects Issue 12561, which clarifies the mechanism to specify probability distribution expressions. Issue 12561 depends on this issue, but the reverse is not true.
Revised Text: On page 45, add the following probability distribution descriptions:
• geometric (p: Real) The Geometric distribution is a discrete distribution bounded at 0 and unbounded on the high side.
• triangular (min: Real, max: Real, mode: Real) The Triangular distribution is often used when no or little data is available; it is rarely an accurate representation of a data set.
• logarithmic (theta: Real) The Logarithmic distribution is a discrete distribution bounded by [1,...]. Theta is related to the sample size and the mean.
Actions taken:
January 24, 2008: received issue
October 16, 2009: closed issue
Discussion: Resolution: This feature needs to be discussed more carefully. We need to be sure that the new distribution functions are really necessary in MARTE. Note that we do not attempt in MARTE to define all existing distribution functions, only those required in common practice. We propose to defer this issue. Note that this is only an enhancement and does not affect the consistency of the specification. Further distribution functions can be added at library level.
Disposition: Deferred
End of Annotations:=====
m: webmaster@omg.org
Date: 24 Jan 2008 09:57:23 -0500
To:
Subject: Issue/Bug Report
--------------------------------------------------------------------------------
Name: Eric MAES
Company: Thales Research & Technology
mailFrom: eric.maes@thalesgroup.com
Notification: Yes
Specification: Probability distributions
Section: 8.3.3.1
FormalNumber: A UML Profile for MARTE
Version: Beta 1
RevisionDate: 04/08/2007
Page: 44
Nature: Enhancement
Severity: Minor
HTTP User Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11
Description
Proposal to extend the list of probability distribution functions with the following (used in AnyLogic):
- geometric (double p) The Geometric distribution is a discrete distribution bounded at 0 and unbounded on the high side. It is a special case of the Negative Binomial distribution. In particular, it is the direct discrete analog for the continuous Exponential distribution. The Geometric distribution has no history dependence, its probability at any value being independent of a shift along the axis.
- laplace (double phi, double beta) The Laplace distribution, sometimes called the double exponential distribution, is an unbounded continuous distribution that has a very sharp central peak, located at theta. The distribution scales with phi.
- chi squared (double nu, double min) The Chi Squared is a continuous distribution bounded on the lower side. Note that the Chi Squared distribution is a subset of the Gamma distribution with beta=2 and alpha=nu/2. Like the Gamma distribution, it has three distinct regions. For nu=2, the Chi Squared distribution reduces to the Exponential distribution, starting at a finite value at minimum x and decreasing monotonically thereafter. For nu<2, the Chi Squared distribution tends to infinity at minimum x and decreases monotonically for increasing x.
For nu>2, the Chi Squared distribution is 0 at minimum x, peaks at a value that depends on nu, decreasing monotonically thereafter.
- rayleigh (double sigma) The Rayleigh distribution is a continuous distribution bounded on the lower side. It is a special case of the Weibull distribution with alpha=2 and beta/sqrt(2)=sigma. Because of the fixed shape parameter, the Rayleigh distribution does not change shape although it can be scaled.
- weibull (double alpha, double beta, double min) The Weibull distribution is a continuous distribution bounded on the lower side. Because it provides one of the limiting distributions for extreme values, it is also referred to as the Frechet distribution and the Weibull-Gnedenko distribution.
- logistic (double beta, double alpha) The Logistic distribution is an unbounded continuous distribution which is symmetrical about its mean [and shift parameter], alpha. The shape of the Logistic distribution is very much like the Normal distribution, except that the Logistic distribution has broader tails.
- pareto (double alpha, double min) The Pareto distribution is a continuous distribution bounded on the lower side. It has a finite value at the minimum x and decreases monotonically for increasing x. A Pareto random variable is the exponential of an Exponential random variable, and possesses many of the same characteristics.
- triangular (double min, double max, double mode) The Triangular distribution is often used when no or little data is available; it is rarely an accurate representation of a data set. However, it is employed as the functional form of regions for fuzzy logic due to its ease of use.
- cauchy (double lambda, double theta) The Cauchy distribution is an unbounded continuous distribution that has a sharp central peak but significantly broad tails. The tails are much heavier than the tails of the Normal distribution.
- beta (double p, double q, double min, double max) The Beta distribution is a continuous distribution that has both upper and lower finite bounds. Because many real situations can be bounded in this way, the Beta distribution can be used empirically to estimate the actual distribution before much data is available. Even when data is available, the Beta distribution should fit most data in a reasonable fashion, although it may not be the best fit. The Uniform distribution is a special case of the Beta distribution with p, q = 1.
- lognormal (double mu, double sigma, double min) The Lognormal distribution is a continuous distribution bounded on the lower side. It is always 0 at minimum x, rising to a peak that depends on both mu and sigma, then decreasing monotonically for increasing x.
- erlang (double beta, int m, double min) The Erlang distribution is a continuous distribution bounded on the lower side. It is a special case of the Gamma distribution where the parameter, m, is restricted to a positive integer. As such, the Erlang distribution has no region where F(x) tends to infinity at the minimum value of x [m<1], but does have a special case at m=1, where it reduces to the Exponential distribution.
- negativeBinomial (double p, double n) The Negative Binomial distribution is a discrete distribution bounded on the low side at 0 and unbounded on the high side. The Negative Binomial distribution reduces to the Geometric Distribution for k = 1. The Negative Binomial distribution gives the total number of trials, x, to get k events (failures...), each with the constant probability, p, of occurring.
- logarithmic (double beta) The Logarithmic distribution is a discrete distribution bounded by [1,...]. Theta is related to the sample size and the mean.
- hypergeometric (int ss, int dn, int ps) The Hypergeometric distribution is a discrete distribution bounded by [0,s].
It describes the number of defects, x, in a sample of size s from a population of size N which has m total defects. The ratio of m/N = p is sometimes used rather than m to describe the probability of a defect. Note that defects may be interpreted as successes, in which case x is the number of failures until (s-x) successes. The sample is taken without replacement.

Date: Thu, 24 Apr 2008 17:53:51 -0400 (EDT)
From: Murray Woodside
Reply-To: cmw@sce.carleton.ca
To: ESPINOZA Huascar 218344
Cc: Eric Maes, marte-ftf@omg.org
Subject: Re: Issue 12196 and Distribution functions

There is undoubtedly a need for more distributions; however, there are many such lists to use. This one is not ideal and has many confusing definitions; maybe a reference to a standard textbook makes more sense. This is a perfect issue to defer to the RTF.

Murray Woodside
Distinguished Research Professor
Dept of Systems and Computer Engineering, Carleton University, 1125 Colonel By Drive, Ottawa K1S 5B6, Canada.
(613)-520-5721.....fax (613)-520-5727....cmw@sce.carleton.ca
(http://www.sce.carleton.ca/faculty/woodside.html)

On Wed, 23 Apr 2008, ESPINOZA Huascar 218344 wrote:

Hi Murray, hi Eric,
There is an issue in NFP asking for new distribution functions to be added. This issue was posted by Eric from Thales, who is copied on this email. I know that there are a lot of distribution functions that are not currently covered. However, the question is whether we really need them in the MARTE spec, or whether they should be added in more specialized libraries (outside MARTE). The pragmatic question is whether we need them for current analysis/simulation tool support.
Eric, could you please let us know what AnyLogic is (it is mentioned in your issue description)? I know that it is a simulation tool, but for which kind of aspects/systems?
Thank you.
Below I copy the issue description (which is not formatted)...
Regards,
Huascar
--
Huascar ESPINOZA, Ph.D.
CEA LIST
Model-Driven Engineering for Real-Time Embedded Systems
91191 GIF/YVETTE CEDEX
Phone/Fax: +33 1 69 08 45 87 / 20 82
France
---
OMG Issue No: 12196
Title: Section: 8.3.3.1 (New distribution functions)
Source: THALES (Mr. Eric Maes, eric.maes@thalesgroup.com)
Summary: Proposal to extend the list of probability distribution functions with the following (used in AnyLogic):
- geometric (double p) The Geometric distribution is a discrete distribution bounded at 0 and unbounded on the high side. It is a special case of the Negative Binomial distribution. In particular, it is the direct discrete analog for the continuous Exponential distribution. The Geometric distribution has no history dependence, its probability at any value being independent of a shift along the axis.
- laplace (double phi, double beta) The Laplace distribution, sometimes called the double exponential distribution, is an unbounded continuous distribution that has a very sharp central peak, located at theta. The distribution scales with phi.
- chi squared (double nu, double min) The Chi Squared is a continuous distribution bounded on the lower side. Note that the Chi Squared distribution is a subset of the Gamma distribution with beta=2 and alpha=nu/2. Like the Gamma distribution, it has three distinct regions. For nu=2, the Chi Squared distribution reduces to the Exponential distribution, starting at a finite value at minimum x and decreasing monotonically thereafter. For nu<2, the Chi Squared distribution tends to infinity at minimum x and decreases monotonically for increasing x. For nu>2, the Chi Squared distribution is 0 at minimum x, peaks at a value that depends on nu, decreasing monotonically thereafter.
- rayleigh (double sigma) The Rayleigh distribution is a continuous distribution bounded on the lower side. It is a special case of the Weibull distribution with alpha=2 and beta/sqrt(2)=sigma.
Because of the fixed shape parameter, the Rayleigh distribution does not change shape although it can be scaled.
- weibull (double alpha, double beta, double min) The Weibull distribution is a continuous distribution bounded on the lower side. Because it provides one of the limiting distributions f [...] 1], but does have a special case at m=1, where it reduces to the Exponential distribution.
- negativeBinomial (double p, double n) The Negative Binomial distribution is a discrete distribution bounded on the low side at 0 and unbounded on the high side. The Negative Binomial distribution reduces to the Geometric Distribution for k = 1. The Negative Binomial distribution gives the total number of trials, x, to get k events (failures...), each with the constant probability, p, of occurring.
- logarithmic (double beta) The Logarithmic distribution is a discrete distribution bounded by [1,...]. Theta is related to the sample size and the mean.
- hypergeometric (int ss, int dn, int ps) The Hypergeometric distribution is a discrete distribution bounded by [0,s]. It describes the number of defects, x, in a sample of size s from a population of size N which has m total defects. The ratio of m/N = p is sometimes used rather than m to describe the probability of a defect. Note that defects may be interpreted as successes, in which case x is the number of failures until (s-x) successes. The sample is taken without replacement.
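
As an illustration only, the three distributions named in the Revised Text of this issue (geometric, triangular, logarithmic) could be sampled as sketched below in Python, using only the standard library. The function names, the rng parameter, and the exact parameterizations are assumptions made for this sketch; they are not taken from MARTE or AnyLogic. In particular, geometric is assumed to count failures before the first success (so its support starts at 0, matching "bounded at 0"), and logarithmic is assumed to be the log-series distribution with P(k) = -theta^k / (k ln(1 - theta)) for k >= 1, which the issue text does not spell out.

```python
import math
import random


def geometric(p, rng=random):
    """Sample a count in {0, 1, 2, ...}: failures before the first success.

    Assumption: p is the per-trial success probability, 0 < p <= 1.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if p == 1.0:
        return 0
    u = 1.0 - rng.random()  # u lies in (0, 1], so log(u) is defined
    # Inverse transform: P(X >= k) = (1 - p) ** k
    return int(math.log(u) / math.log1p(-p))


def triangular(minimum, maximum, mode, rng=random):
    """Sample a real value in [minimum, maximum] with the given mode."""
    if not minimum <= mode <= maximum:
        raise ValueError("require minimum <= mode <= maximum")
    # Argument order of random.triangular is (low, high, mode)
    return rng.triangular(minimum, maximum, mode)


def logarithmic(theta, rng=random):
    """Sample an integer in {1, 2, ...} from the log-series distribution.

    Assumption: P(k) = -theta**k / (k * ln(1 - theta)), with 0 < theta < 1.
    """
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must be in (0, 1)")
    u = rng.random()
    k = 1
    pk = -theta / math.log1p(-theta)  # P(1)
    # Sequential search through the cumulative probabilities;
    # the pk > 0 guard stops the loop if pk underflows to zero.
    while u > pk and pk > 0.0:
        u -= pk
        k += 1
        pk *= theta * (k - 1) / k  # P(k) = P(k-1) * theta * (k-1) / k
    return k


if __name__ == "__main__":
    rng = random.Random(2008)  # fixed seed so repeated runs match
    print("geometric(0.3):", [geometric(0.3, rng) for _ in range(5)])
    print("triangular(1, 5, 2):", [round(triangular(1.0, 5.0, 2.0, rng), 3) for _ in range(5)])
    print("logarithmic(0.5):", [logarithmic(0.5, rng) for _ in range(5)])
```

Passing a random.Random instance as rng keeps runs reproducible; a tool using a different convention (for example, a geometric distribution that counts trials rather than failures) would need the corresponding shift by one.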