## TROUG Day 2012: System Design for a Million TPS

Here are the slides (System Design for a Million TPS) for my talk at TROUG Day 2012. Many of you also asked for the name of the book I mentioned. It was

Thank you for joining us at TROUG Day 2012.

## All Primary Keys are Unique, but Not All Unique Keys are Primary: Don’t Borrow Primary Keys

Several times I have seen database designers borrow primary keys from other systems. (Here *borrowing* means reusing a primary key defined in some other context as the primary key of your own context.) A major reason is probably the way today’s enterprise applications are written: they are not monolithic but modular. Each party implements some part of the whole system, and the parts communicate using popular techniques such as web services, servlets, and EJBs.

A module written before its dependents defines a primary key in its own context and propagates it to identify an entity within that context. The problem starts when a dependent application adopts the primary key of the caller application and reuses it as its own primary key.

Here is a scenario which I will refer to throughout the post:

- We have an infrastructure module *I*, defined to let other parties create instances of the services available in your company.
- We have another hub module *H*, responsible for gathering requests from actual clients and enriching them before sending requests to *I*.
- Both of those isolated systems have their own databases, and they communicate via web services.
- Module *H* maintains a primary key *request#* for each service definition request given by the actual client, in order to refer to a request whenever needed.
- The database designers of *H* and *I* sit together. Since application *H* was written before *I* and has already defined a primary key within its context, the designer of *H* recommends reusing *request#* within the context of *I*. That way they guarantee that whenever they refer to some *request#* they mean the same thing, and module *I* becomes a natural extension of module *H*.

Now that we have set up our toy environment, let’s elaborate on the potential risks in the decision the designers of *module H* and *module I* reached.

## Primary Keys are updated

Although it deserves a deeper discussion, we can simply say that people do update primary keys although they shouldn’t. Since there are very few legitimate cases in which you can update primary keys, we can conclude with a high degree of confidence that:

If you need to update a primary key, it is an indication of poor database design.

But the question here is: as the designer of module *I*, how can you protect yourself from a possible *request#* update on the module *H* side? (In this context *protection* means no primary key changes on your side.) If you share the primary key of module *H*, you simply can’t.

*Bottom line: don’t borrow your primary keys from other databases/applications, because what you borrow may be subject to change.*

## Limiting the variety of applications implicitly

During database design, the designer’s mind is usually busy with the functional requirements of the module, and he/she silently assumes that the set of applications accessing the module will stay constant over time.

However, this is usually not true. When new modules start to access the module you implement, are you sure that they will use the same primary key as the first module does? If not, the designer’s first workaround is semantic overloading (meaning two different things with the same notation, which usually yields ambiguity), if that is possible at all (incompatible types may rule it out). If overloading is not possible, they will clone the majority of the data model and end up with two models that stem from the same point but are split in two just because the primary keys differ.

*Bottom line: don’t borrow your primary keys from other databases/applications, because what you borrow may not be valid for some other application.*

## It may not even be unique at all

Last but not least, assume the following scenario. Module *H1*, using *request#* as its primary key, was the only hub module accessing your *I* module. Then another team decides to implement a new hub module *H2* with different capabilities. Since *H2* shares the same capabilities as *H1* at the layer that communicates with module *I*, they simply clone that part from *H1*. Unfortunately, they recreate the sequence generating *request#* values in *H2*’s database because it is brand new.

I believe you get the point. *H2* cannot even create a new record in your module because of primary key conflicts: the tuple *(request#, moduleName)* is what is actually unique.

*Bottom line: don’t borrow your primary keys from other databases/applications, because what you borrow as unique may not be unique at a later point in time.*
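The collision is easy to reproduce on paper. Here is a small sketch in R (the language used in the rest of this blog) rather than in a database; the data frames, the column names and the two hubs' request numbers are all made up for illustration.

```r
# Hypothetical data: two hub modules, each with its own sequence starting at 1,
# send requests to module I.
h1 <- data.frame(module = "H1", request_no = 1:3)
h2 <- data.frame(module = "H2", request_no = 1:2)  # sequence recreated in H2
incoming <- rbind(h1, h2)

# Borrowed key: request_no alone is no longer unique inside module I.
borrowed_conflicts <- sum(duplicated(incoming$request_no))

# The (request_no, module) pair is still unique, so it can back a unique index.
pair_conflicts <- sum(duplicated(incoming[, c("request_no", "module")]))

# Module I's own key, generated locally and independent of the hubs.
incoming$id <- seq_len(nrow(incoming))

print(borrowed_conflicts)
print(pair_conflicts)
```

The borrowed column collides as soon as the second hub shows up, while the locally generated `id` never depends on what the hubs do.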

## Conclusion

I’ve tried to explain the drawbacks of borrowing primary keys from other systems. As a rule of thumb (although there are a few exceptions):

Never let some other application generate primary keys for you; use an isolated database sequence to generate your primary key values.

One final thing, to tie the post content to its title: you are free to define unique indexes on top of the primary keys of other systems (or to use them as part of a composite unique index in your module) if they deliver them to you and you need to ensure their uniqueness in your module. But making them unique does not make them primary.

## Create your Own R Server on Oracle Linux

It is very common for people to run R on their individual PCs. One major problem is that the hardware limitations of a PC will prevent you from dealing with large volumes of data.

Moreover, if you wish to use Oracle R Enterprise you need database connectivity, and for some platforms like Mac there is no client available yet. In this post you will find how to install R in a centralised fashion so that any individual can access it via their favorite browser.

## Preparing Oracle Enterprise Linux

- Ensure that the ol5_u6_base (or a later release) and el5_addons (ol5_addons is also OK) repos are enabled in the /etc/yum.repos.d/public-yum-el5.repo file (by setting the enabled flag to 1).
- Issue *yum install R.x86_64* (notice that the R package is in *el5_addons* and its other dependencies come from *el5_addons* and *ol5_u6_base*).
- Download 64-bit RStudio-Server by issuing *wget http://download2.rstudio.org/rstudio-server-0.96.331-x86_64.rpm*
- Install RStudio-Server by issuing *sudo rpm -Uvh rstudio-server-0.96.331-x86_64.rpm*
- Start a browser and go to http://<rstudio-servername>:8787
- Provide your Linux authentication details.
- RStudio-Server is ready to use.
- For more details on RStudio Server configuration, refer to the Management and Configuration documentation.

## VMware Tools for Oracle UEK

If you wish to use VMware Tools with Oracle UEK as the guest OS (you should, in order to let the guest interact with the host, get proper screen resolution, and many other features), you need to perform some extra steps, unlike some other Linux distros such as Ubuntu, which VMware can configure automatically.

Here is the list of actions you should take for installation:

- Configure yum on your Linux installation.
- Using the *VMware Virtual Machine* menu, choose the *Install VMware Tools* option, which will mount a pseudo device.
- Copy *VMwareTools-8.8.4-730257.tar.gz* into */tmp* and extract it using *tar -xzvf VMwareTools-8.8.4-730257.tar.gz*
- Run *yum install kernel-headers kernel-uek-devel*
- Run *./vmware-install.pl* as root

You are done.

## Open World 2012 Session

The time has come: it is Oracle Open World time once again. I fly to San Francisco next week to join the largest technology event in the world. This will be the Complex Event Processing, Advanced Analytics and Data Warehousing year for me at Open World.

After spending a non-speaker year (2011), this year I will be presenting on Oracle Enterprise R and ODM by going over several use cases. Here are the session details:

- **What:** Database Data Mining: Practical Enterprise R and Oracle Advanced Analytics
- **When:** 1st of October 2012, 16:45 – 17:45
- **Where:** Moscone West – 3016

Join me and learn that BI is not about filtering, rotating, combining and aggregating tables without knowing the actual reason for doing so 🙂

## Line of Sight (LoS) Analysis: Optimizing the Observers for Best Coverage (Part 4)

In this part we look at the question: “How should we place *N* observers on terrain such that the visible region (as many *green* points as possible by our convention) is maximized?”

We first define pseudo code to find the optimal layout of *N* observers (optimality is not guaranteed; keep in mind that many optimization problems of this kind are NP-hard). For simplicity we will assume that all observers have the same height (7 units), which can be relaxed later.

We will use a constructive way of finding the optimal layout for *N* observers. Here is the pseudo code:

- Step 1: Find the optimal layout for 1 observer and compute its coverage ratio (best coverage for one observer).
- Step 2: Add another random observer *(uniform(-8,8), uniform(-8,8))* and compute the coverage for those two observers (the random one and the best observer from Step 1).
- Step 3: If the new coverage is better than the coverage in Step 1, use this layout as the input of the optimization solver.
- Step 4: Otherwise repeat Step 2 to find a better coverage.
- Step 5: For more than 2 observers, apply the idea in Step 2 recursively.
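The retry loop in the pseudo code can be sketched independently of the terrain. This is only a minimal sketch: `grow_layout`, `toy_coverage`, the coarse grid and the `max_tries` cap are my own illustrative assumptions, not part of the script used later in this post.

```r
# Sketch of Steps 2-4. `coverage` is any function mapping a flat vector
# (x1, y1, ..., xn, yn) to a coverage ratio in [0, 1].
grow_layout <- function(coverage, best_so_far, baseline, max_tries = 1000) {
  for (attempt in seq_len(max_tries)) {
    candidate <- c(best_so_far, runif(2, -8, 8))  # Step 2: add a random observer
    if (coverage(candidate) > baseline) {         # Step 3: better than before?
      return(candidate)                           # use it as the solver's start
    }
  }                                               # Step 4: otherwise retry
  NULL  # no improving candidate found within max_tries
}

# Toy coverage (an assumption for the demo): share of a coarse grid lying
# within 4 units of at least one observer.
grid <- expand.grid(x = seq(-8, 8, by = 1), y = seq(-8, 8, by = 1))
toy_coverage <- function(v) {
  m <- matrix(v, ncol = 2, byrow = TRUE)
  covered <- rep(FALSE, nrow(grid))
  for (i in seq_len(nrow(m))) {
    in_range <- ((grid$x - m[i, 1])^2 + (grid$y - m[i, 2])^2) <= 16
    covered <- covered | in_range
  }
  mean(covered)
}

set.seed(1)
first <- c(0, 0)
start <- grow_layout(toy_coverage, first, toy_coverage(first))
print(start)
```

For more than two observers (Step 5) the same helper can be called again, with the optimized layout and its coverage as the new `best_so_far` and `baseline`.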

There are some blurry points in this pseudo code. We will define them before moving on to the implementation.

### Coverage

The very first thing to be defined is the coverage idea. As you will remember from the second post, we defined our 3D terrain by evaluating our *height* function over the grid of *x* and *y* values varying over *[-8,8]* with a step size of *0.4* units. This gives 41 values per axis and therefore *1681* different *(x,y)* tuples. Here is the definition of coverage based on our conventions:

**Coverage Ratio** is the ratio of the number of points within LoS of a given observer or group of observers (at least one of the observers marks the point as *green*) to the total number of points (1681).
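As a tiny illustration of this definition, here is a sketch on a made-up status vector of 8 points. The `coverage_ratio` helper is hypothetical; it divides by the length of the vector instead of the hard-coded 1681 so that it works for any grid size.

```r
# Hypothetical merged status vector for a tiny terrain of 8 points.
status <- c("green", "red", "green", "yellow", "green", "red", "green", "green")

# Coverage ratio: share of points that at least one observer marks green.
coverage_ratio <- function(status) {
  sum(status == "green") / length(status)
}

print(coverage_ratio(status))  # 5 green points out of 8 -> 0.625
```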

### How to Find Optimal Coordinates of Observers ?

Optimality is a very common word, used for many different concepts in real life and engineering. Let me define it once more for our purpose:

**Optimization** is the process of searching for an *N-dimensional vector* using a *technique* to maximize/minimize *a function of that N-dimensional vector*.

Now let’s substitute the three italic terms of the definition for our problem:

- **N-dimensional vector** in our problem is the vector of the first two coordinates of each observer, such as *(x1,y1,x2,y2,…,xn,yn)*.
- **Technique** to be used is the Nelder and Mead technique (a version of it is implemented in R).
- **Function** to be maximized is the coverage function which we have defined for a given set of observers.

### Implementation

Let’s start by defining the function to be optimized, that is, the *coverage* of the terrain for a given set of observers.

    targetfunc <- function(observer) {
      m <- matrix(data = observer, ncol = 2, byrow = TRUE)
      # Compute merged status of all observers
      mergedstatus <- rep("red", length(terrain$height))
      for (oidx in seq_len(nrow(m))) {
        terrain$dist2observer <- distance(terrain, c(m[oidx, ], 7))
        status <- LoS(terrain, c(m[oidx, ], 7), maxVisibleDistance)
        mergedstatus <- updatestatus(mergedstatus, status)
      }
      sum(mergedstatus == "green") / 1681
    }

The *matrix* routine lets us create a table with two columns (the first two coordinates of each observer) and *length(observer)/2* rows. We have used the technique discussed in part 3 to compute the merged status of the observers. The *sum(mergedstatus=="green")* call counts the number of *green* points on the terrain with respect to the observers.

Next is the computation of the first input to be given to the optimization solver, because for any optimization technique the starting point is critical. Without any formal definition, we will use our pseudo code to choose a *“good starting point/vector”*.

    n <- 2
    baselineValue <- 0.541344
    previousObserver <- c(1.15861411217711, 1.1499851362913)
    observers <- c(previousObserver, runif(2, -8, 8))
    while (targetfunc(observers) <= baselineValue) {
      observers <- c(previousObserver, runif(2, -8, 8))
    }
    print(observers)

The code above is an example of initializing the *observers* vector when searching for the best 2-observer layout. It uses the best coverage ratio of the 1-observer case (*54.1344%*) and adds a new random observer next to the best observer found for the single-observer case.

The final point is the optimization solver, which is very simple and totally handled by R:

    result <- optim(observers, targetfunc,
                    control = list(fnscale = -1, trace = 5, REPORT = 1))

The first parameter is the initial value of the input vector (prepared by the previous code piece). The second parameter is the function to be maximized. The *optim* function solves minimization problems by default; setting the *fnscale* attribute of the *control* parameter to a negative value turns it into a maximization solver.
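To see the effect of *fnscale* in isolation, here is a minimal sketch on a toy function whose maximum is known in advance. The function `f` and its peak at (1, -2) are assumptions made up for the demonstration and have nothing to do with the terrain.

```r
# Toy objective with a known maximum of 5 at (1, -2).
f <- function(v) 5 - (v[1] - 1)^2 - (v[2] + 2)^2

# Nelder-Mead is optim's default method; fnscale = -1 makes it maximize.
fit <- optim(c(0, 0), f, control = list(fnscale = -1))

print(fit$par)    # close to c(1, -2)
print(fit$value)  # close to 5
```

Without the `fnscale = -1` entry the same call would search for a minimum of `f`, which does not exist for this function.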

Now we can combine everything into our final script:

    library(rgl)

    ##################
    # Functions
    ##################

    # 3D Terrain Function
    height <- function(point) {
      sin(point$x) + 0.125 * point$y * sin(2 * point$x) +
        sin(point$y) + 0.125 * point$x * sin(2 * point$y) + 3
    }

    # Linear Function
    linear <- function(px, observer, target) {
      v <- observer - target
      y <- ((px - observer[1]) / v[1]) * v[2] + observer[2]
      z <- ((px - observer[1]) / v[1]) * v[3] + observer[3]
      data.frame(x = px, y = y, z = z)
    }

    # Euclidean distance from every terrain point to the observer
    distance <- function(terrain, observer) {
      sqrt((terrain$x - observer[1])^2 +
           (terrain$y - observer[2])^2 +
           (terrain$height - observer[3])^2)
    }

    LoS <- function(terrain, observer, maxVisibleDistance) {
      status <- c()
      for (i in seq_len(nrow(terrain))) {
        if (observer[1] == terrain$x[i] && observer[2] == terrain$y[i]) {
          if (observer[3] >= terrain$height[i]) {
            if (terrain$dist2observer[i] > maxVisibleDistance) {
              status <- c(status, "yellow")
            } else {
              status <- c(status, "green")
            }
          } else {
            status <- c(status, "red")
          }
        } else {
          # All points on line
          line <- linear(seq(from = min(observer[1], terrain$x[i]),
                             to = max(observer[1], terrain$x[i]),
                             by = 0.1),
                         observer,
                         c(terrain$x[i], terrain$y[i], terrain$height[i]))
          # Terrain Height
          h <- height(line)
          # LoS Analysis
          aboveTerrain <- round((line$z - h), 2) >= 0.00
          visible <- !is.element(FALSE, aboveTerrain)
          if (visible) {
            # Second Rule
            if (terrain$dist2observer[i] <= maxVisibleDistance) {
              status <- c(status, "green")
            } else {
              status <- c(status, "yellow")
            }
          } else {
            status <- c(status, "red")
          }
        }
      }
      status
    }

    updatestatus <- function(status1, status2) {
      mergedstatus <- c()
      for (i in seq_along(status1)) {
        if (status1[i] == "green" || status2[i] == "green") {
          mergedstatus <- c(mergedstatus, "green")
        } else if (status1[i] == "yellow" || status2[i] == "yellow") {
          mergedstatus <- c(mergedstatus, "yellow")
        } else {
          mergedstatus <- c(mergedstatus, "red")
        }
      }
      mergedstatus
    }

    ##################
    # Input
    ##################

    # Max visible distance
    maxVisibleDistance = 8

    # Generate points with a step size of 0.4
    x <- seq(from = -8, to = 8, by = 0.4)
    xygrid <- expand.grid(x = x, y = x)
    terrain <- data.frame(xygrid, height = height(xygrid))

    targetfunc <- function(observer) {
      #print(observer)
      m <- matrix(data = observer, ncol = 2, byrow = TRUE)
      # Compute merged status of all observers
      mergedstatus <- rep("red", length(terrain$height))
      for (oidx in seq_len(nrow(m))) {
        terrain$dist2observer <- distance(terrain, c(m[oidx, ], 7))
        status <- LoS(terrain, c(m[oidx, ], 7), maxVisibleDistance)
        mergedstatus <- updatestatus(mergedstatus, status)
      }
      sum(mergedstatus == "green") / 1681
    }

    n <- 3
    baselineValue <- 0.541344
    previousObserver <- c(-1.32661956044593, 2.18870625357827)

    # List of observers (x1,y1,x2,y2); the height is fixed to 7 in targetfunc
    observers <- c(previousObserver, runif(2, -8, 8))
    while (targetfunc(observers) <= baselineValue) {
      observers <- c(previousObserver, runif(2, -8, 8))
    }
    print(observers)

    result <- optim(observers, targetfunc,
                    control = list(fnscale = -1, trace = 5, REPORT = 1))

### Results

Here is the coverage ratio for different numbers of observers after optimization.

You can verify that covering more than 98% of the whole terrain with only 6 random observers is not trivial; it requires a “careful” choice of their layout.

Finally, let’s check the step-by-step improvement in coverage as we add more optimal observers.

## Single Observer

## Two Observers

## Three Observers

## Four Observers

## Five Observers

## Six Observers

## Line of Sight (LoS) Analysis: Multiple Observers (Part 3)

In this part of my LoS Analysis series, I will try to extend 3D LoS analysis to multiple observers. Assume that you drop multiple observers onto a terrain with the aim of covering it perfectly (100% *green*).

We will reuse the R code from Part 2. However, we need to add a simple piece of code to merge the Line of Sight results of multiple observers. If a point on the terrain is visible and within range of any of the observers, the point is visible (*green*); if no observer marks it *green* but at least one has an unobstructed view of it from too far away, it is out of LoS due to distance (marked *yellow*); in all other cases the point is *red*. The *updatestatus* function is implemented for this purpose.

    library(rgl)

    ##################
    # Functions
    ##################

    # 3D Terrain Function
    height <- function(point) {
      sin(point$x) + 0.125 * point$y * sin(2 * point$x) +
        sin(point$y) + 0.125 * point$x * sin(2 * point$y) + 3
    }

    # Linear Function
    linear <- function(px, observer, target) {
      v <- observer - target
      y <- ((px - observer[1]) / v[1]) * v[2] + observer[2]
      z <- ((px - observer[1]) / v[1]) * v[3] + observer[3]
      data.frame(x = px, y = y, z = z)
    }

    # Euclidean distance from every terrain point to the observer
    distance <- function(terrain, observer) {
      sqrt((terrain$x - observer[1])^2 +
           (terrain$y - observer[2])^2 +
           (terrain$height - observer[3])^2)
    }

    LoS <- function(terrain, observer, maxVisibleDistance) {
      status <- c()
      for (i in seq_len(nrow(terrain))) {
        if (observer[1] == terrain$x[i] && observer[2] == terrain$y[i]) {
          if (observer[3] >= terrain$height[i]) {
            if (terrain$dist2observer[i] > maxVisibleDistance) {
              status <- c(status, "yellow")
            } else {
              status <- c(status, "green")
            }
          } else {
            status <- c(status, "red")
          }
        } else {
          # All points on line
          line <- linear(seq(from = min(observer[1], terrain$x[i]),
                             to = max(observer[1], terrain$x[i]),
                             by = 0.1),
                         observer,
                         c(terrain$x[i], terrain$y[i], terrain$height[i]))
          # Terrain Height
          h <- height(line)
          # LoS Analysis
          aboveTerrain <- round((line$z - h), 2) >= 0.00
          visible <- !is.element(FALSE, aboveTerrain)
          if (visible) {
            # Second Rule
            if (terrain$dist2observer[i] <= maxVisibleDistance) {
              status <- c(status, "green")
            } else {
              status <- c(status, "yellow")
            }
          } else {
            status <- c(status, "red")
          }
        }
      }
      status
    }

    updatestatus <- function(status1, status2) {
      mergedstatus <- c()
      for (i in seq_along(status1)) {
        if (status1[i] == "green" || status2[i] == "green") {
          mergedstatus <- c(mergedstatus, "green")
        } else if (status1[i] == "yellow" || status2[i] == "yellow") {
          mergedstatus <- c(mergedstatus, "yellow")
        } else {
          mergedstatus <- c(mergedstatus, "red")
        }
      }
      mergedstatus
    }

    ##################
    # Input
    ##################

    # Observer location
    #observers<-c(0,0, 6,1,1,6)

    # Max visible distance
    maxVisibleDistance = 8

    # Generate points with a step size of 0.4
    x <- seq(from = -8, to = 8, by = 0.4)
    xygrid <- expand.grid(x = x, y = x)
    terrain <- data.frame(xygrid, height = height(xygrid))

    # List of observers (x1,y1,z1,x2,y2,z2,...): eight random observers at height 6
    observers <- c(runif(2, -8, 8), 6, runif(2, -8, 8), 6,
                   runif(2, -8, 8), 6, runif(2, -8, 8), 6,
                   runif(2, -8, 8), 6, runif(2, -8, 8), 6,
                   runif(2, -8, 8), 6, runif(2, -8, 8), 6)
    m <- matrix(data = observers, ncol = 3, byrow = TRUE)

    # Compute merged status of all observers
    mergedstatus <- rep("red", length(terrain$height))
    for (oidx in seq_len(nrow(m))) {
      terrain$dist2observer <- distance(terrain, m[oidx, ])
      status <- LoS(terrain, m[oidx, ], maxVisibleDistance)
      mergedstatus <- updatestatus(mergedstatus, status)
    }

    # Set merged status as the ultimate status
    terrain <- data.frame(terrain, status = mergedstatus)

    rgl.open()
    rgl.surface(x, x,
                matrix(data = terrain$height, nrow = length(x), ncol = length(x)),
                col = matrix(data = mergedstatus, nrow = length(x), ncol = length(x)))
    bg3d("gray")

    # Mark all observers
    for (oidx in seq_len(nrow(m))) {
      spheres3d(c(m[oidx, 1]), c(m[oidx, 3]), c(m[oidx, 2]),
                radius = 0.25, color = "white")
    }
    rgl.viewpoint(-60, 30)
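The merge rule implemented by *updatestatus* can also be written without an explicit loop. The following is only a sketch of an equivalent formulation; the name `merge_status` and the two short status vectors are made up for the demonstration.

```r
# Vectorized variant of the merge rule: green wins over yellow, yellow over red.
merge_status <- function(status1, status2) {
  ifelse(status1 == "green" | status2 == "green", "green",
         ifelse(status1 == "yellow" | status2 == "yellow", "yellow", "red"))
}

a <- c("red", "yellow", "green", "red")
b <- c("red", "red", "yellow", "yellow")
print(merge_status(a, b))  # "red" "yellow" "green" "yellow"
```

Growing a vector with `c()` inside a loop copies it on every iteration, so a vectorized form like this tends to be noticeably faster on large terrains.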

# A Few Examples

Here are a few examples. All observers are uniformly distributed over the terrain using the *runif* function.

### Trivial Case: Single Observer

### Two Observers

### Four Observers

### Eight Observers

## Line of Sight (LoS) Analysis: 3D Terrain Analysis (Part 2)

In my previous post on LoS Analysis, I tried to explain briefly the basics of LoS in two-dimensional space. Obviously real-life problems are based on three-dimensional terrains, although the basic concepts are all the same. In this second part I will try to adapt the same techniques, with a few modifications, to three-dimensional terrains.

### 3D Terrain Visualization with R

One of the first differences in 3D LoS analysis is the terrain visualization. We cannot use the *plot* function for proper visualization in 3D. Fortunately R has packages for almost any type of problem. I will use the rgl package, which can be installed using the `install.packages("rgl")` command.

Once you have the *rgl* package, generating pseudo 3D terrains as we did for 2D is trivial.

You can use the following R script to generate your own 3D terrains.

    library(rgl)

    # 3D Terrain Function
    height <- function(x, y) {
      sin(x) + 0.125 * y * sin(2 * x) + sin(y) + 0.125 * x * sin(2 * y) + 0.25
    }

    # Terrain boundaries -8<=x<=8 and -8<=y<=8
    boundary <- c(-8, 8)

    # Terrain grid with a step size of 0.1 units
    xy <- seq(from = boundary[1], to = boundary[2], by = 0.1)

    # Evaluate all heights for all grid points
    z <- outer(xy, xy, height)

    # A few visualization settings
    zlim <- range(z)
    zlen <- zlim[2] - zlim[1] + 1
    colorlut <- terrain.colors(zlen)    # height color lookup table
    col <- colorlut[z - zlim[1] + 1]    # assign colors to heights for each point

    # Draw the terrain
    rgl.open()
    bg3d("gray")
    rgl.surface(xy, xy, z, color = col)

A new function in this script is *outer*, which takes two vectors and builds a matrix with one entry per pair of their elements (like the product of a *column-vector* and a *row-vector*; the product of a *row-vector* with a *column-vector* is a scalar, named the *dot/inner product*). The third parameter of the function lets us apply a given function (*height* in our case) to each pair instead of plain multiplication. You can play with the *height* function to get fancier 3D terrains, and for the best visualization you may need the *viewpoint* routine in the rgl package.
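A small example may make the behaviour of *outer* more concrete. This is just a sketch with a made-up function `f` on two short vectors.

```r
# outer() calls the function once with two long, expanded vectors, so the
# function must be vectorized (plain arithmetic and sin() are).
f <- function(x, y) x + 10 * y

z <- outer(1:3, 1:2, f)
print(z)
# z[i, j] equals f(x[i], y[j]): rows follow the first vector, columns the second
```

In the terrain script the same mechanism evaluates *height* at every *(x,y)* combination of the grid in a single call.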

### LoS in 3D Terrain

Line of Sight analysis on 3D terrain uses the same principles as in 2D. Use the following R script to decide on the status of a point (invisible, visible, visible but far away):

    library(rgl)

    ##################
    # Functions
    ##################

    # 3D Terrain Function
    height <- function(x, y) {
      sin(x) + 0.125 * y * sin(2 * x) + sin(y) + 0.125 * x * sin(2 * y) + 0.25
    }

    # Linear Function
    linear <- function(x, observer, target) {
      v <- observer - target
      y <- ((x - observer[1]) / v[1]) * v[2] + observer[2]
      z <- ((x - observer[1]) / v[1]) * v[3] + observer[3]
      data.frame(x = x, y = y, z = z)
    }

    # Euclidean distance between two points
    distance <- function(p0, p1) {
      sqrt(sum((p0 - p1)^2))
    }

    ##################
    # Input
    ##################

    # Observer location
    observer <- c(10, 10, 1)
    # Target on terrain
    target <- c(5, 5, height(5, 5))
    # Max visible distance
    maxVisibleDistance = 4

    # Generate points with a step size of 0.1
    x <- seq(from = min(observer[1], target[1]),
             to = max(observer[1], target[1]),
             by = 0.1)
    # All points on line
    line <- linear(x, observer, target)
    # Terrain Height
    h <- height(line$x, line$y)

    # LoS Analysis
    # The line touches the terrain at the target itself, so the threshold
    # must be 0 (as in the 2D version); otherwise no target is ever visible.
    aboveTerrain <- round((line$z - h), 2) >= 0.00
    # First Rule
    visible <- !is.element(FALSE, aboveTerrain)
    if (visible) {
      # Second Rule
      d <- distance(observer, target)
      if (d <= maxVisibleDistance) {
        status <- "LoS"
      } else {
        status <- "non-LoS due to Distance"
      }
    } else {
      status <- "non-LoS due to Blocking"
    }

There are a few changes in the script compared to the 2D version. The first one is the *linear* function. The new version evaluates not only the second (*y*) but also the third dimension (*z*). Notice that *z* is our height dimension by convention. We have also used the *data.frame* function to combine all dimensions into a table of point coordinates.

The second difference is in the *height* function. It is no longer a mapping from *x* to *y* but a mapping from *(x,y)* to *z*.

The rest of the 3D version of the script is pretty much the same as before or too trivial to discuss further.

### Visualizing LoS on 3D Terrain

Until this point we have analyzed the LoS of a single point on 2D and 3D terrains. But network analysts usually wish to know the LoS map of the terrain with respect to a given observer. In other words, we need to visually understand which regions on the 3D terrain are visible by the *observer*, invisible by the *observer* due to blocking, or further than the limit from the *observer*.

Here is the LoS map of our pseudo 3D terrain with respect to an observer with a given set of coordinates and maximum service range (*green* vs *yellow* regions).

You can obtain this visualization using the following R script.

    library(rgl)

    ##################
    # Functions
    ##################

    # 3D Terrain Function
    height <- function(point) {
      sin(point$x) + 0.125 * point$y * sin(2 * point$x) +
        sin(point$y) + 0.125 * point$x * sin(2 * point$y) + 3
    }

    # Linear Function
    linear <- function(px, observer, target) {
      v <- observer - target
      y <- ((px - observer[1]) / v[1]) * v[2] + observer[2]
      z <- ((px - observer[1]) / v[1]) * v[3] + observer[3]
      data.frame(x = px, y = y, z = z)
    }

    # Euclidean distance from every terrain point to the observer
    distance <- function(terrain, observer) {
      sqrt((terrain$x - observer[1])^2 +
           (terrain$y - observer[2])^2 +
           (terrain$height - observer[3])^2)
    }

    LoS <- function(terrain, observer, maxVisibleDistance) {
      status <- c()
      for (i in seq_len(nrow(terrain))) {
        if (observer[1] == terrain$x[i] && observer[2] == terrain$y[i]) {
          if (observer[3] >= terrain$height[i]) {
            if (terrain$dist2observer[i] > maxVisibleDistance) {
              status <- c(status, "yellow")
            } else {
              status <- c(status, "green")
            }
          } else {
            status <- c(status, "red")
          }
        } else {
          # All points on line
          line <- linear(seq(from = min(observer[1], terrain$x[i]),
                             to = max(observer[1], terrain$x[i]),
                             by = 0.1),
                         observer,
                         c(terrain$x[i], terrain$y[i], terrain$height[i]))
          # Terrain Height
          h <- height(line)
          # LoS Analysis
          aboveTerrain <- round((line$z - h), 2) >= 0.00
          visible <- !is.element(FALSE, aboveTerrain)
          if (visible) {
            # Second Rule
            if (terrain$dist2observer[i] <= maxVisibleDistance) {
              status <- c(status, "green")
            } else {
              status <- c(status, "yellow")
            }
          } else {
            status <- c(status, "red")
          }
        }
      }
      status
    }

    ##################
    # Input
    ##################

    # Observer location
    observer <- c(0.835597146302462, -1.71025141328573, 6)

    # Max visible distance
    maxVisibleDistance = 8

    # Generate points with a step size of 0.4
    x <- seq(from = -8, to = 8, by = 0.4)
    xygrid <- expand.grid(x = x, y = x)
    terrain <- data.frame(xygrid, height = height(xygrid))
    terrain <- data.frame(terrain, dist2observer = distance(terrain, observer))
    terrain <- data.frame(terrain, status = LoS(terrain, observer, maxVisibleDistance))

    rgl.open()
    rgl.surface(x, x,
                matrix(data = terrain$height, nrow = length(x), ncol = length(x)),
                col = matrix(data = terrain$status, nrow = length(x), ncol = length(x)))
    bg3d("gray")

    # Mark the observer
    spheres3d(c(observer[1]), c(observer[3]), c(observer[2]),
              radius = 0.5, color = "white")
    rgl.viewpoint(-60, 30)

For a better visualization, R allows you to implement spinning 3D terrains using the *play3d* function and to record them in gif format using the *movie3d* function, as I did below.

## Line of Sight (LoS) Analysis: Basics (Part 1)

# Introduction

Line of Sight analysis is a commonly used technique in the telecommunication industry for A/I (Air Interface) equipment planning and allocation. In the simplest terms, LoS is the question of whether a point in N-dimensional space is visible from another observer point. The question can be used to answer where to locate a transceiver on terrain so that it can serve customers in some region A.

Before relatively more complicated problems, let’s start with an easy example focusing on two-dimensional terrains. Throughout the post we will use R for coding, which is my favorite option for any mathematical problem (statistics, plotting, linear algebra, optimization, etc.). But you can easily adapt the code to MATLAB, Python, or your favorite language.

We will start by defining a mathematical function to generate our pseudo terrains. For this purpose trigonometric functions (*sin*, *cos*) and polynomial functions are good choices because of their wavy shapes. Here is an example of a trigonometric terrain:

**Figure 1 Trigonometric Terrain**

To generate this two-dimensional terrain, use the following piece of code:

    x <- seq(from = 4, to = 10, by = 0.01)
    y <- sin(x) + cos(2 * x) + sin(3 * x) + cos(4 * x) + 3
    windows()  # Windows-only graphics device; dev.new() is the portable equivalent
    plot(x, y, 'l',
         main = "y=sin(x)+cos(2x)+sin(3x)+cos(4x)+3",
         ylab = "height", col = "blue")

**Figure 2 Polynomial Terrain**

To obtain this terrain, use the following piece of R script:

    x <- seq(from = 0, to = 6, by = 0.01)
    y <- x * (x - 1) * (x - 2) * (x - 3) * (x - 4) * (x - 5) * (x - 6) + 100
    windows()  # Windows-only graphics device; dev.new() is the portable equivalent
    plot(x, y, 'l',
         main = "y=x(x-1)(x-2)(x-3)(x-4)(x-5)(x-6)+100",
         ylab = "height", col = "blue")

Combining polynomial terrain functions with trigonometric ones will give you fancier terrains.

# What is LoS ?

You may think that we have already answered this question, but that was an informal attempt which is not very useful for solving the problem. To solve the problem methodically we need to understand what makes a *target* visible (within LoS) by the *observer*.

As you can see in *Figure 3*, the **green** point is within line of sight of the *observer* (**blue** point). However, there is a pseudo hill between the **red** point and the *observer*. The difference is that the line connecting the *observer* and the **green** point is always above the terrain function, whereas this does not hold for the line connecting the *observer* and the **red** point (for x ∈ [~2.5, ~3.5] the red line is under the terrain curve).

**Figure 3 LoS vs non-LoS**

This was the first point (blocking) we had to define. The second one is easier and concerns the maximum Euclidean distance between the *observer* and the *target*. A sufficiently long distance may cause a phase shift in the signal, or, depending on weather conditions and terrain properties, you may observe diffraction problems (actually there might be more issues than those). In turn this causes signal quality issues or call drops. In Figure 3, although *blocking* is not an issue between the *observer* and the **yellow** point, the *target* is out of the visible range (say 8 units) of the *observer*.

You can generate *Figure 3* using the following R script:

    # Terrain Function
    height <- function(x) {
      x * x / 3 + sin(x) + cos(2 * x) + sin(3 * x) + cos(4 * x) + sin(5 * x) + cos(6 * x) + 3
    }

    # Observer location
    observer <- c(1.5, 8.9)

    # Generate terrain points with a tolerance of 0.1
    x <- seq(from = -0.1, to = 6.1, by = 0.1)
    terrainHeight <- height(x)

    windows()  # Windows-only graphics device; dev.new() is the portable equivalent

    # Draw terrain
    plot(x, terrainHeight, type = 'b',
         xlim = range(x), ylim = range(terrainHeight),
         main = "Line of Sight (LoS)", ylab = "Height", xlab = "")

    # Not LoS
    points(x = c(observer[1], x[41]),
           y = c(observer[2], terrainHeight[41]),
           col = "red", type = 'b')

    # LoS
    points(x = c(observer[1], x[5]),
           y = c(observer[2], terrainHeight[5]),
           col = "green", type = 'b')

    # LoS but far
    points(x = c(observer[1], x[length(x)]),
           y = c(observer[2], terrainHeight[length(x)]),
           col = "yellow", type = 'b')

    # Draw Observer
    points(x = c(observer[1]), y = c(observer[2]), col = "blue", pch = 10)

# Method to Decide LoS

Finally let’s define a method to find all visible, invisible, and “far” points on any terrain. Since it is not “easy” to decide analytically whether the line connecting the *observer* and the *target* “is above” the terrain for an arbitrary terrain function, we will use a simple numeric method.

We define a *step* size small enough (around the *Spatial Tolerance*) to generate all *x* values between the *observer* and the *target*; the *seq* function is a good choice for this. We then evaluate both the line connecting the *observer* and the *target* and the terrain function at these *x* values. The terrain is evaluated with the *height* function, and the line with the *linear* function, which uses the parametric definition of a line. The next step is to search for any *x* value whose line evaluation is less than its terrain evaluation (the *aboveTerrain* and *visible* variables). The rest is simply computing the Euclidean distance and assigning a value to the *status* variable.

    ##################
    # Functions
    ##################

    # Terrain Function
    height <- function(x) {
      x * x / 3 + sin(x) + cos(2 * x) + sin(3 * x) + cos(4 * x) + sin(5 * x) + cos(6 * x) + 3
    }

    # Linear Function
    linear <- function(x, observer, target) {
      v <- observer - target
      ((x - observer[1]) / v[1]) * v[2] + observer[2]
    }

    # Euclidean distance between two points
    distance <- function(p0, p1) {
      sqrt(sum((p0 - p1)^2))
    }

    ##################
    # Input
    ##################

    # Observer location
    observer <- c(1.5, 9)
    # Target on terrain
    target <- c(5, height(5))
    # Max visible distance
    maxVisibleDistance = 4

    # Generate points with a step size of 0.1
    x <- seq(from = min(observer[1], target[1]),
             to = max(observer[1], target[1]),
             by = 0.1)
    # Terrain Height
    h <- height(x)
    # y Values
    y <- linear(x, observer, target)

    # LoS Analysis
    aboveTerrain <- round((y - h), 2) >= 0.00
    # First Rule
    visible <- !is.element(FALSE, aboveTerrain)
    if (visible) {
      # Second Rule
      d <- distance(observer, target)
      if (d <= maxVisibleDistance) {
        status <- "LoS"
      } else {
        status <- "non-LoS due to Distance"
      }
    } else {
      status <- "non-LoS due to Blocking"
    }
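If you want to classify many targets, the two rules can be wrapped into a single function. The following is a sketch under my own assumptions: the name `los_status`, its arguments, and the two toy terrains `flat` and `wall` are hypothetical and only serve to make the outcome easy to check by hand.

```r
# Classify a target as seen from an observer; both are c(x, height) points.
# Assumes the observer and the target have different x coordinates.
los_status <- function(observer, target, height, max_dist, step = 0.1) {
  x <- seq(from = min(observer[1], target[1]),
           to = max(observer[1], target[1]), by = step)
  slope <- (target[2] - observer[2]) / (target[1] - observer[1])
  line_y <- observer[2] + slope * (x - observer[1])
  # First rule: the line must never dip below the terrain.
  if (any(round(line_y - height(x), 2) < 0)) {
    return("non-LoS due to Blocking")
  }
  # Second rule: the target must be within the maximum visible distance.
  if (sqrt(sum((observer - target)^2)) <= max_dist) {
    "LoS"
  } else {
    "non-LoS due to Distance"
  }
}

# Toy terrains (assumptions for the demo): flat ground and a single wall.
flat <- function(x) rep(0, length(x))
wall <- function(x) ifelse(abs(x - 2) < 0.5, 10, 0)

near   <- los_status(c(0, 5), c(3, 0), flat, max_dist = 10)
far    <- los_status(c(0, 5), c(3, 0), flat, max_dist = 4)
hidden <- los_status(c(0, 1), c(4, 0), wall, max_dist = 10)
print(c(near, far, hidden))
```

The *height* function of this post can be passed in place of the toy terrains, for example `los_status(c(1.5, 9), c(5, height(5)), height, max_dist = 4)`.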