
Com 114 - ND 1 Statistics For Computing - 2024

The document discusses the meaning and definitions of statistics. It covers topics like primary and secondary data, quantitative and qualitative variables, population and samples. It also defines basic statistical concepts like entity, variable, random variable, discrete and continuous variables. The document provides examples and details for each topic.

LECTURE NOTE FOR STATISTICS FOR COMPUTING.

COM 114

MEANING OF STATISTICS
Definitions of Statistics
Statistics can be defined as a management tool for making decisions.
It is a scientific approach to the presentation of numerical information in such a way that one will
have a maximum understanding of the reality represented by such information. Statistics is also defined as
the presentation of facts in numerical form. A more comprehensive definition presents statistics
as a scientific method used for collecting, summarizing, classifying, analyzing and presenting
information in such a way that we can have a thorough understanding of the reality the information represents.
In short, it is a branch of science that deals with the collection, organization, presentation, analysis,
interpretation and documentation of information for easy decision making.

From all these definitions, you will realize that statistics is concerned with numerical data. Examples of
such numerical data are the heights and weights of pupils in a primary school when evaluating the
nutritional well-being of the pupils, and the accident fatalities on a particular road over a period of time.

You should also know that, besides numerical data, there are non-numerical data such as the
taste of brands of biscuits, the greenness of some vegetables and the texture of some joints of a wholesale
cut of meat. Non-numerical data cannot be subjected to statistical analysis unless they are transformed to
numerical data. To transform the greenness of vegetables to numerical data, a five-point scale for measuring
the colour can be developed, with 1 indicating very dull and 5 indicating very green.
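The transformation described above can be sketched in a few lines of Python. This is only an illustration: the text fixes just the two end points of the scale (1 = very dull, 5 = very green), so the three intermediate labels and the sample observations below are assumptions made for the example.

```python
# Five-point greenness scale. Only the end points (1 and 5) come from the
# notes; the labels for 2, 3 and 4 are hypothetical.
GREENNESS_SCALE = {
    "very dull": 1,
    "dull": 2,
    "moderate": 3,
    "green": 4,
    "very green": 5,
}


def code_greenness(labels):
    """Convert qualitative greenness labels to scores on the 1-5 scale."""
    return [GREENNESS_SCALE[label] for label in labels]


# Hypothetical observations on five vegetable samples.
observations = ["very green", "dull", "green", "very dull", "green"]
scores = code_greenness(observations)

# Once coded, the data can be summarized numerically, for example by a mean.
mean_score = sum(scores) / len(scores)
print(scores, mean_score)
```

Once the labels are coded, the usual numerical summaries can be applied to the scores.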

The Roles of Statistics

You will realize that statistics is useful in all spheres of human life. A woman with a given amount of
money, going to the market to purchase foodstuff for the family, makes decisions on the types of food items
to purchase, and on the quantity and the quality of the items, to maximize the satisfaction she will derive
from the purchase. For all these decisions, the woman makes use of statistics.

Government uses statistics as a tool for collecting data on economic aggregates such as national income,
savings, consumption and gross national product. Government also uses statistics to measure the effects of
external factors on its policies and to assess the trends in the economy so that it can plan future policies.

Government also uses statistics during a census. The various forms sent by the government to individuals and
firms on annual income, tax returns, prices, costs, output and wage rates generate a lot of statistical data
for the use of the government.
SMS201 BUSINESS STATISTICS I

Business uses statistics to monitor the various changes in the national economy for the various budget
decisions. Business makes use of statistics in production, marketing, administration and personnel
management.

Statistics is also used extensively to control and analyze stock levels such as minimum, maximum and reorder
levels. It is used by business in market research to determine the acceptability of a product and the
quantity that will be demanded at various prices by a given population in a geographical area. Management
also uses statistics to make forecasts about the sales and labour cost of a firm. Management uses statistics
to establish a mathematical relationship between two or more variables for the purpose of predicting one
variable in terms of the others. For the conduct and analysis of biological, physical, medical and social
research, we use statistics extensively.

Basic Concepts In Statistics

Let us quickly define some of the basic concepts you will continue to come across in this course.

• Entity: This may be a person, place or thing on which we make observations. In studying the
nutritional well-being of pupils in a primary school, the entity is a pupil in the school.

• Variable: This is a characteristic that assumes different values for different entities. The weights of
pupils in the primary school constitute a variable.

• Random Variable: If we can specify, for a given variable, a mathematical expression called a
function, which gives the relative frequency of occurrence of the values that the variable can assume, the
function is called a probability function and the variable a random variable.

• Quantitative Variable: This is a variable whose values are given as numerical quantities. An
example of this is the hourly patronage of a restaurant.

• Qualitative Variable: This is a variable that is not measurable in numerical form or that cannot be
counted. Examples of this are the colours of fruits and the taste of some brands of biscuit.

• Discrete Variable: This is a variable that can only assume whole numbers. Examples are the
number of Local Government Council Areas in the States of Nigeria and the number of female students in the
various programmes of the National Open University.

A discrete variable has "interruptions" between the values it can assume. For instance, between 1 and 2
there are infinitely many values such as 1.1, 1.11, 1.111, 1.1111 and so on, none of which a discrete
variable can assume. These gaps are called interruptions.

• Continuous Variable: This is a variable that can assume both decimal and non-decimal values. There is
always a continuum of values that the continuous variable can assume. The interruptions that characterize the
discrete variable are absent in the continuous variable. Weight, for example, can take whole values or decimal
values such as 20 kilograms and 220.1752 kilograms.

• Population: This is the entire collection of entities of interest in a study. In the study of how workers
in Nigeria spend their leisure hours, all the workers in Nigeria constitute the population of the study.

• Sample: This is the part of the population that is selected for a study. In studying the income distribution
of students in the National Open University, the incomes of 1000 students selected for the study, from the
population of all the students in the Open University will constitute the sample of the study.

• Random Sample: This is a sample drawn from a population in such a way that the results of its analysis
may be used to generalize about the population from which it was drawn.
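The notes do not say how a random sample is drawn. One common approach, assumed here purely for illustration, is simple random sampling without replacement, in which every entity has the same chance of being selected. The sketch below uses a hypothetical population of 5,000 student ID numbers; only the sample size of 1,000 echoes the example in the text.

```python
import random


def draw_simple_random_sample(population, sample_size, seed=None):
    """Draw a simple random sample without replacement.

    `seed` is optional and only makes the draw reproducible.
    """
    rng = random.Random(seed)
    return rng.sample(population, sample_size)


# Hypothetical population: ID numbers of 5,000 students.
population = list(range(1, 5001))

# Select 1,000 students for the study.
sample = draw_simple_random_sample(population, 1000, seed=114)
print(len(sample), sample[:5])
```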

DATA: NATURE, SOURCE AND METHODS OF COLLECTION

Nature of Statistical Data

Primary Data

You are already aware that statistics uses numerical data. Numerical data can be divided into:

a) Primary Data and
b) Secondary Data

Primary data are data collected by or on behalf of the person or people who are going to make use of the
data. They are collected specifically for a purpose and used for the purpose for which they are
collected.
Examples of Primary data are:

(i) Heights and weights of students collected to determine their nutritional well-being.
(ii) The population of primary school pupils in the states of the country to allow the Federal Government
to plan for primary education.
(iii) The academic performance of students in secondary schools under various types of leaders to know
the leadership style suitable for the secondary schools.

The various methods of collecting primary data include surveys, direct interviews, direct observation,
questionnaires, experiments, peer group or focus group discussions, and censuses. These will be further
discussed in this unit.

Primary data could be very expensive to collect when the elements of the study sample are widely scattered
and when the items of equipment for data collection, as in many experiments, are capital intensive.
However, errors can be minimized when collecting primary data since the researcher can always take
adequate precautions during collection.

Secondary Data

These are data used by a person or people other than the person or people by whom or for whom the
data were collected. They are data collected for some other purpose, frequently for administrative
reasons, and used for a purpose other than the one for which they were collected.

Secondary data are usually collected from published sources such as textbooks, journals, newspapers,
magazines and gazettes.

Examples of secondary data include:

(i) Accident fatalities on a particular road over a period of time collected from the Police or Road safety
corps.
(ii) Dietary requirements of various age groups collected from a nutrition textbook.
(iii) Age distribution in Nigeria collected from the publications of the National Population Census.

From the discussion so far, you will know that secondary data are second-hand data. This shows the need to
know as much as possible about the data. In trying to know much about the secondary data, we
need to consider the following points:

(i) How the data was collected
(ii) How the data has been processed
(iii) The accuracy of the data
(iv) How far the data has been summarized
(v) How comparable the data is with other tabulations
(vi) How to interpret the data, especially when figures collected for one purpose are used for another

With secondary data, we usually strike a compromise between what we want and what we are able to find.

Secondary data have the following advantages:

(i) They are very inexpensive to collect since they are readily and abundantly available in
published sources.
(ii) There is a great variety of secondary data on a wide range of subjects.
(iii) Many of these data have been collected for many years hence we can use them to establish trends for
forecasting.

Despite these advantages, we must use secondary data with great care, since such data may not give
the exact kind of information required and may not be in the most suitable form for the purpose at hand.

Student Assessment Exercise


For each of the Primary and Secondary data, list:
(i) The characteristics
(ii) The advantages
(iii) The disadvantages

SOURCES OF DATA

1 Micro-Statistical Information

Firms and private organizations, when monitoring the activities of their businesses, produce a lot of
information that is specific to the organization or firm and is used for its decision-making processes.
The information produced by these firms and organizations is called micro-statistical information.
In micro-economics, the concern is for the household and the firm. The information produced or
generated at these levels is micro-statistical information. For a firm, such information includes that
generated from production, marketing and personnel.

2 Macro-Statistical Information

These are produced by the public sector of the economy. They relate to the country as a whole.
Such information includes population, education, rate of inflation, level of unemployment and so on.

Macro-economics is concerned with such aggregates as national income, gross national product, saving,
consumption, gross domestic product and so on. These are macro-statistical information.


3 Statistical Information in Business

In the performance and monitoring of various activities in the firm, a lot of data is produced. The quality
and the quantity of the information generated in the firm depend on the size of the firm and the resources
available to the firm.
The firm is interested in what is happening to the national economy and what is happening in the industry it
belongs to. You should know that a combination of firms makes the industry. As regards the national economy,
the firm wishes to know the interest rates and the level of unemployment. As regards the industry, the firm
will always wish to know wage rates, prices and the level of output so that it can compare its performance
with that of its competitors in the industry.

The firm also generates data in the following areas.

(i) Production:
The firm collects data on the results of quality control, prices of raw materials, defectiveness of
consignments of raw materials, wage rates, stock of materials, accident rates, absenteeism rates and unit cost.
(ii) Marketing:
The firm will want to know the budgeted sales, the advertising cost to sustain the sales, the distribution
costs, etc.

Other information could be generated from the Personnel and Accounts departments to aid the decision-making
process of the firm.

4 Government Statistics
Governments produce statistics to be able to measure the effects of their policies, to monitor the effects
of external factors on their policies and to be able to assess trends so that they can plan future policies.
Macro-statistical information is generated by governments, as you have learned in this unit. The various
governments rely on firms and individuals to generate these macro-statistical data.

Methods Of Data Collection

1. Surveys

In this unit, you have learned that secondary data are already available; hence they only need to be
retrieved when there is the need to use them. The collection of primary data involves a survey or inquiry of
one type or the other.

Some surveys can be limited in the sense that they can be carried out with a few minutes of observation.
Others can be detailed. When surveys are detailed, information from the surveys is more acceptable and
valued than when they are limited.

Examples of surveys are:

(i) Government surveys
(ii) Market research surveys - carried out for one particular client and not published in any form.
(iii) Research surveys - carried out by academicians and published in journals.
(iv) Ad hoc surveys - commissioned by firms on a wide variety of subjects.

Survey methods consist of the following stages:

(i) The survey design - This depends on the objectives of the survey, the available methods, and the amount
of money and time that can be allocated for collecting the information.
(ii) The pilot survey - This is the preliminary survey carried out on a very small scale to make sure that
the design and methodology of the survey are likely to produce the information required.
(iii) Collection of information - This involves the use of observation, interview and questionnaire.
(iv) Coding - We may need to pre-code the questions to facilitate classification and tabulation.
(v) Tabulation - There is the need to tabulate the data, to give a summary of the data.
(vi) Secondary statistics - We will need to calculate secondary statistics such as means and percentages.
(vii) Reports - Reports must be written on the results and the results must be illustrated with graphs and
diagrams.
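
The coding, tabulation and secondary statistics stages can be illustrated with a short Python sketch. The coding scheme (1 = Yes, 2 = No, 3 = Undecided) and the ten responses below are hypothetical; they are not taken from any survey in these notes.

```python
from collections import Counter

# Coding: a hypothetical pre-coded question with three possible answers.
CODES = {1: "Yes", 2: "No", 3: "Undecided"}

# Hypothetical coded responses from ten respondents.
responses = [1, 2, 1, 1, 3, 2, 1, 1, 2, 1]

# Tabulation: count how many respondents gave each coded answer.
tally = Counter(responses)
table = {CODES[code]: tally[code] for code in CODES}

# Secondary statistics: express each count as a percentage of the total.
total = len(responses)
percentages = {label: 100 * count / total for label, count in table.items()}

for label in table:
    print(f"{label:<10}{table[label]:>3}{percentages[label]:>7.1f}%")
```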

2. Observations

This is one of the methods of collecting primary data. It can be used to find out the use to which a
particular facility is put. It can also be used to study the behaviour of people in a work place.

There can be the following types of observation:

• Participant observation
• Systematic observation
• Mechanical observation

In participant observation, the observer is involved in the activities he is trying to observe. Examples of
this are the vice-chancellor who eats at the cafeteria to observe the performance of
the cooks and the acceptability of the food by the students, and the lecturer who sits for his own paper with
the students to observe the difficulty encountered by the students in the paper. This method can have a
serious influence on the entities the observer is observing. The method may also consume a lot of time.

In systematic observation, the observer does not take part in the activity. The method is used when events
can be investigated without the participants knowing that somebody is observing them. Though the method
is objective, it does not question the motives of people observed.

In mechanical observation, mechanical devices do the observation. For instance, the number of vehicles
passing a particular point on the road can be recorded mechanically. Sophisticated mechanical means such
as television, film and tape recorders are used to provide more complex information. Mechanical
observations can be more effective than observations made by individual observers, who can be
subject to bias.

Generally, observation has the following problems:

• Objectivity- to remain objective, the observer cannot ask the question that will help him to understand the
events he is observing.
• Selectivity- an observer can unintentionally become selective in perception, recording and reporting.
• Interpretation- the observer may impute meanings to the behaviour of people that the people do not intend.
• Chance- a chance event may be mistaken for a recurrent one.
• Participation- the participation of the observer can influence the behaviour of people being observed.

3. Interviewing

This is a conversation with a purpose. There can be formal and informal interviews. Informally, everybody
uses interviews to obtain information.
The formal interview is initiated by the interviewer, who approaches the person he is interested in
interviewing. The interviewer therefore arranges the venue and the time, and prepares the questions to be
asked. The interviewer also secures the means of recording the responses.

Interviews are used in a number of situations that include:

(i) Obtaining opinion polls for the success of a candidate in an election
(ii) Studying why people behave in a certain way
(iii) Selecting applicants for some jobs
(iv) Reporting, especially on radio and television

Before the interview is conducted, the interviewer must have knowledge of his respondents. The interviewer
must also establish an initial rapport with his respondents and must explain clearly and briefly the purpose
of the interview to them.

The interviewer should be very objective. He must not express his own opinions and must not
influence the answers of the respondents. The language of the questions must be at the level of the
respondents. If the questions are written in a language different from that of the respondents, the questions
must be translated into the language of the respondents, but the answers must be recorded by the interviewer
in the language of the questions. This was the case during the last census in Nigeria. The questionnaire used
was written in English, a language that many Nigerians do not understand. To obtain responses from people in
this group, the questions were translated into the language they understand.

Interviewing method has a number of advantages:

(i) It allows the interviewer to have personal contact with the respondents, therefore allowing more
questions to be asked, which improves the quality of the information.
(ii) It allows the interviewer to persuade unwilling respondents.
(iii) It allows an experienced interviewer to know when to make calls and recalls.

Interviewing method also has a number of disadvantages:

(i) A biased interviewer may influence the responses of the respondents to suit his own opinions.
(ii) A biased respondent, in a matter affecting his ego, may give false responses.
(iii) It is very expensive and time consuming, especially when respondents are widely scattered.
(iv) It may be difficult to interview some top people in government and business.

4. Questionnaire

This is a list of questions drawn in such a way that the questions are related to the objectives of the study
being conducted, and the responses to the question will be analyzed to provide solutions to the problems
we attempt to solve in the study.

There are two types of questionnaire, namely:

(i) Structured or fixed-response questionnaire
(ii) Unstructured or open-ended questionnaire

The structured questionnaire consists of a list of questions drawn on the study being conducted. Each
question is accompanied by alternative answers from which the respondent picks the appropriate answer or
answers. An example of a structured question is this:

What is your monthly salary?

• Below N7,500
• N7,500 - N10,000
• N10,000 - N12,500
• Above N12,500

Structured questionnaire has a number of advantages:

(i) It is very easy to complete and analyze.
(ii) Most of the questions are answered.
(iii) The responses are always related to the objectives of the study.

A major disadvantage of the structured questionnaire is the fact that it does not allow the respondents to
express their own views, which may enhance the quality of the information collected.

Unstructured questionnaire is a list of questions drawn on the study on which information is required. The
questions are not accompanied by alternative answers as in the structured questionnaire. The respondents
are free to provide their own responses. An example of a question in an unstructured questionnaire is
"What is your monthly salary?"

The unstructured questionnaire has the following advantages:

(i) It allows for the views of the respondents.
(ii) It is not difficult to construct, since no question is accompanied by alternative answers.

However, the unstructured questionnaire has the following disadvantages:

(i) It is not easy to complete and analyze.
(ii) Many of the responses supplied by the respondents may not be related to the objectives of the study.
(iii) The respondents may not answer all questions, especially those they consider difficult.
This will definitely reduce the quality of the information obtained.

In drawing a standard questionnaire, the following must be considered:

(i) List all the questions you want to ask.
(ii) Decide whether to use direct or indirect questions or both.
(iii) A question must be limited to one idea.
(iv) Ask simple and interesting questions before the difficult and uninteresting ones.
(v) State the questions clearly.
(vi) A question must mean the same thing to all the respondents.
(vii) The language must be at the level of the respondents.
(viii) Do not ask questions that will hurt your respondents.
(ix) The questionnaire should be pre-tested on a mock audience before it is administered to the real
sample, to detect any difficulty in completing and analyzing the questionnaire, so that the questionnaire
can be reviewed before it finally gets to the real audience it is meant for.

Exercise

List the various methods of data collection, with their descriptions, advantages and disadvantages.

DATA PRESENTATION
1. Histogram

One of the ways of representing a frequency distribution is by means of a histogram. In constructing a
histogram, we plot the frequencies of the class intervals against the class boundaries (not the class limits).
The vertical axis is used for the frequencies and the horizontal axis for the class boundaries.

Example

Suppose the table below shows the age distribution of 50 spectators in a secondary school sports competition.
You are required to represent the data with a histogram.


Age Distribution of spectators in school sports.

Age (Years) Frequency

10-14 2
15-19 3
20-24 5
25-29 7
30-34 11
35-39 8
40-44 6
45-49 5
50-54 3

In solving the exercise, you require a graph sheet. You also need to prepare the class boundaries for the
class intervals. You will then plot the frequencies against the respective class boundaries. You need to
recall how the class boundaries are computed from unit 3; we will not discuss it here.

Age (Years)
Class Interval    Class Boundaries    Frequency
10-14             9.5-14.5            2
15-19             14.5-19.5           3
20-24             19.5-24.5           5
25-29             24.5-29.5           7
30-34             29.5-34.5           11
35-39             34.5-39.5           8
40-44             39.5-44.5           6
45-49             44.5-49.5           5
50-54             49.5-54.5           3
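
The class boundaries in the table can be reproduced with a short Python sketch. The notes leave the method to unit 3, so the rule used here is an assumption: half of the gap between one class's upper limit and the next class's lower limit is subtracted from each lower limit and added to each upper limit. It also assumes that the gap is the same between all classes.

```python
# Class limits from the age distribution table.
class_limits = [
    (10, 14), (15, 19), (20, 24), (25, 29), (30, 34),
    (35, 39), (40, 44), (45, 49), (50, 54),
]


def class_boundaries(limits):
    """Return (lower boundary, upper boundary) for each class interval.

    Assumes the gap between consecutive classes is constant.
    """
    gap = limits[1][0] - limits[0][1]  # e.g. 15 - 14 = 1
    half_gap = gap / 2
    return [(low - half_gap, high + half_gap) for low, high in limits]


boundaries = class_boundaries(class_limits)
for (low, high), (lb, ub) in zip(class_limits, boundaries):
    print(f"{low}-{high}: {lb}-{ub}")
```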

Histogram Showing the Data in the Example Above

[Figure: histogram with Frequency on the vertical axis and Class Boundaries on the horizontal axis; the
mode is marked on the horizontal axis.]

The histogram can be used to estimate the mode of the distribution. To do this, locate the highest cell
(bar) in the histogram. Join the top right-hand corner of this cell to the top right-hand corner of the
preceding cell, and join the top left-hand corner of the highest cell to the top left-hand corner of the
succeeding cell. Locate the intersection of the two lines and draw a vertical line from the intersection down
to the horizontal axis. The value at which this vertical line meets the horizontal axis is the mode. You need
to see the construction on the histogram. The mode read from the figure is approximately 32.5.
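
As a check on the graphical reading, the same construction can be expressed arithmetically. The notes do not state this formula; it is the standard grouped-data mode formula, mode = L + d1 / (d1 + d2) x c, which is what the two crossing lines work out to when the cells have equal width. Here L is the lower boundary of the highest cell, c is the class width, and d1 and d2 are the differences between the highest frequency and the frequencies of the preceding and succeeding cells. The function name below is illustrative only.

```python
def mode_from_histogram(lower_boundary, width, f_modal, f_prev, f_next):
    """Estimate the mode of grouped data with equal class widths.

    d1 and d2 are the excess of the modal frequency over its neighbours.
    """
    d1 = f_modal - f_prev
    d2 = f_modal - f_next
    return lower_boundary + d1 / (d1 + d2) * width


# Highest cell: 29.5-34.5 with frequency 11; neighbours have 7 and 8.
estimated_mode = mode_from_histogram(29.5, 5, 11, 7, 8)
print(round(estimated_mode, 2))
```

The result is about 32.4, which agrees with the value of roughly 32.5 read from the graph, allowing for the limited precision of a graphical reading.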

2. Frequency Polygon.

Another way of representing a frequency distribution graphically is by means of a frequency polygon.

In constructing a frequency polygon, we plot the frequency against the class marks. You learned in unit 3 that class
mark of a class interval is the mean of the lower and the upper class boundaries or limits of the class interval.

Example

Construct a frequency polygon for the data in the example above.

To construct the frequency polygon, you need to compute the class marks for the class intervals. You also
need to make the polygon touch the horizontal axis at both ends. To do this, you have to compute the class
mark for an imaginary class interval at the beginning and another imaginary class interval at the end of the
distribution.

If you look at table 4.1 that is of interest here, there is no class interval 5-9 at the beginning and there is no class
interval 55-59 at the end of the distribution. We need to bring these intervals in and assign a frequency of 0 to each of
them.

Let us now compute the class mark for the class intervals.

Class Intervals    Class Marks    Frequency
5-9                7              0
10-14              12             2
15-19              17             3
20-24              22             5
25-29              27             7
30-34              32             11
35-39              37             8
40-44              42             6
45-49              47             5
50-54              52             3
55-59              57             0
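
The class marks in the table follow directly from the definition (the mean of the lower and upper limits of each class interval). A minimal Python sketch, including the two imaginary classes 5-9 and 55-59 with frequency 0:

```python
# Class intervals, including the imaginary classes at both ends.
intervals = [
    (5, 9), (10, 14), (15, 19), (20, 24), (25, 29), (30, 34),
    (35, 39), (40, 44), (45, 49), (50, 54), (55, 59),
]
frequencies = [0, 2, 3, 5, 7, 11, 8, 6, 5, 3, 0]

# Class mark = mean of the lower and upper class limits.
class_marks = [(low + high) / 2 for low, high in intervals]

# Points to plot: class mark on the horizontal axis, frequency on the vertical.
polygon_points = list(zip(class_marks, frequencies))
print(polygon_points)
```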

The next activity is to plot the frequency against the class marks. The frequency is on the vertical axis and
the class mark on the horizontal axis. The frequency polygon must be plotted on a graph sheet.

3. Ogive

This is another way of representing a frequency distribution graphically. The other name for an ogive is
cumulative frequency distribution curve. This curve is very important in the determination of the median,
quartiles, percentiles and semi-interquartile range, which will be discussed in some subsequent units.

In plotting an ogive for a distribution, you will do the following:

(a) Compute the upper class boundaries of all the classes, including that of an imaginary class at the
beginning of the distribution.
(b) Prepare a cumulative frequency distribution for the data.
(c) Plot the cumulative frequency against the upper class boundary.

Example

Construct an ogive for the distribution in the table.

Class Interval    Frequency    Less than    Cumulative Frequency
5-9               0            9.5          0
10-14             2            14.5         2
15-19             3            19.5         5
20-24             5            24.5         10
25-29             7            29.5         17
30-34             11           34.5         28
35-39             8            39.5         36
40-44             6            44.5         42
45-49             5            49.5         47
50-54             3            54.5         50

You will realize that the class interval 5-9 was introduced. A frequency of 0 was assigned to this class
interval since the original table did not show the class. This is done so that the ogive can take its origin
from the horizontal axis. From the table you will see that all the values that are less than 24.5 are
contained in the class intervals 5-9, 10-14, 15-19 and 20-24. The number of these values, which is equal to
the cumulative frequency of the interval, is 0+2+3+5 = 10.
You should know that an ogive can only be plotted on a graph sheet.
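
The cumulative frequencies used for the ogive are running totals of the class frequencies. A minimal Python sketch for the age distribution is given below; the last upper class boundary is taken as 54.5, the upper boundary of the class 50-54.

```python
from itertools import accumulate

# Upper class boundaries, starting with the imaginary class 5-9.
upper_boundaries = [9.5, 14.5, 19.5, 24.5, 29.5, 34.5, 39.5, 44.5, 49.5, 54.5]
frequencies = [0, 2, 3, 5, 7, 11, 8, 6, 5, 3]

# "Less than" cumulative frequency: running total of the frequencies.
cumulative = list(accumulate(frequencies))

# Points to plot: upper boundary (horizontal) against cumulative frequency.
ogive_points = list(zip(upper_boundaries, cumulative))
for boundary, cum in ogive_points:
    print(f"less than {boundary}: {cum}")
```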

Exercises

The frequency distribution of the monthly salaries of workers in a firm is as follows:

Salaries          Number of Workers
2,000-7,000       5
7,000-12,000      8
12,000-17,000     24
17,000-22,000     5
22,000-27,000     3
27,000-32,000     2
32,000-37,000     1

For the distribution, prepare

(i) a histogram and
(ii) an ogive

Using the histogram, estimate the mode of the distribution.

4. Bar Chart

4.1 Simple Bar Chart

Another way of representing data graphically is by means of a bar chart. A bar chart shows vertical bars with
equal width to represent the values of a variable in some intervals of time. Since the bars have equal width,
the height (and hence the area) of each bar is proportional to the magnitude of the quantity it represents.
The bars must be drawn on a graph sheet. There are simple, component and multiple bar charts. These will be
discussed in this unit.

In a simple bar chart, we use the bars to represent the value of a variable in a period of time.

Example

Suppose the monthly sales of a firm for three consecutive months are given as follows

Months      Sales (in millions of Naira)

January 5.2
February 7.4
March 10.6

You are required to represent the data on a simple bar chart.
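
Before drawing on a graph sheet, the relative bar lengths can be previewed with a rough text sketch in Python. This is only an illustration: the bars are drawn horizontally for convenience in plain text (the notes call for vertical bars), and the scale of two characters per million Naira is an arbitrary assumption.

```python
# Monthly sales in millions of Naira, from the example.
sales = {"January": 5.2, "February": 7.4, "March": 10.6}


def text_bar_chart(data, scale=2):
    """Build a horizontal text bar chart; `scale` is characters per unit."""
    lines = []
    for label, value in data.items():
        bar = "#" * round(value * scale)
        lines.append(f"{label:<9}{bar} {value}")
    return "\n".join(lines)


chart = text_bar_chart(sales)
print(chart)
```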

Simple Bar Chart

[Figure: simple bar chart of the sales data, with Months (Jan., Feb., March) on the horizontal axis.]

4.2 Component Bar Charts

Another bar chart, which shows the total value for a time period and the values of the components that make
up the total, is the component bar chart. In this case, the bar for the total value for a period is divided
into the values for the components that make up the total.

Example

Suppose a hotel has three departments A, B and C from where sales are made, and the annual records of the net
profit of the departments for three consecutive years are as presented below.

Net Profits in the Departments (Millions of Naira)

Years    A      B      C
1999     3.2    3.4    2.8
2000     2.8    3.0    2.6
2001     4.0    3.2    3.6

You are to represent the values of the net profits with the aid of a component bar chart.
You will need to plot the values of the components A, B and C for the three years. The first year will show a
bar of length 9.4cm divided into 3.2cm for A, 3.4cm for B and 2.8cm for C. You will then repeat the exercise
for the values of A, B and C in years 2000 and 2001. There must be a legend to show the shading of the
components.
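
The bar lengths and the points at which each bar is divided can be worked out beforehand. The following Python sketch does this for the hotel data; the function name and the rounding to one decimal place are choices made for this illustration.

```python
# Net profits (millions of Naira) of departments A, B and C, from the example.
profits = {
    1999: {"A": 3.2, "B": 3.4, "C": 2.8},
    2000: {"A": 2.8, "B": 3.0, "C": 2.6},
    2001: {"A": 4.0, "B": 3.2, "C": 3.6},
}


def stack_segments(parts):
    """Return (component, start, end) positions along one component bar."""
    segments = []
    bottom = 0.0
    for name, value in parts.items():
        segments.append((name, round(bottom, 1), round(bottom + value, 1)))
        bottom += value
    return segments


# Total bar length for each year, rounded to avoid floating-point noise.
totals = {year: round(sum(parts.values()), 1) for year, parts in profits.items()}

for year, parts in profits.items():
    print(year, totals[year], stack_segments(parts))
```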

Component Bar Chart

[Figure: component bar chart of the net profits, with Years (1999, 2000, 2001) on the horizontal axis.]

4.3 Multiple Bar Chart

For the multiple bar chart, each component of every year is represented by a bar whose length corresponds to
the value of the component.

Example

Using the values of net profit in example 4.5, construct a multiple bar chart for the hotel for the period of
three years.
In this exercise, you will draw a single bar for each component of every year. The bars for a year will now
look like a histogram.

Multiple Bar Chart

[Figure: multiple bar chart with separate bars for departments A, B and C in each Year (1999, 2000, 2001).]

5. Pie Chart

This is another means of representing data graphically. The values of the items represented with a pie chart
are proportional to the areas of the sectors that represent them.

In the case of a pie chart, the sectoral angles are computed for the items based on their values and on the
total value of the items.

Sectoral angle of an item = (Value of item / Total value of all items) x 360°

After obtaining the sectoral angles for the items, we use a pair of compasses, a pencil, a
protractor and a ruler to draw the sectors.

Example

Suppose the monthly income of a worker is allocated as follows.

Items        Amount (N)
Feeding      9,625
Rent         4,125
Education    5,500
Savings      6,875
Others       1,375
TOTAL        27,500

You are required to represent the data on a pie chart. We will therefore compute the sectoral angles.

Items        Sectoral Angle
Feeding      126°
Rent         54°
Education    72°
Savings      90°
Others       18°
TOTAL        360°

For instance, the sectoral angle for feeding is calculated as

$$\frac{9625}{27500} \times 360^\circ = 126^\circ$$

The total value of the items is N27,500. This is to say that the monthly income of the worker is N27,500. 360° is
used in the calculation because the sum of angles at a point is 360°. If the sectoral angles are computed correctly,
the sum of the angles must be equal to 360°.
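
The same calculation can be repeated for every item and checked against the 360° total. This is a minimal Python sketch using the worker's income figures; the function name `sectoral_angles` is an illustrative assumption.

```python
# Monthly income allocation in Naira, copied from the example above.
allocation = {
    "Feeding": 9625,
    "Rent": 4125,
    "Education": 5500,
    "Savings": 6875,
    "Others": 1375,
}

def sectoral_angles(values):
    """Sectoral angle of each item = value * 360 / total value of all items."""
    total = sum(values.values())
    return {item: value * 360 / total for item, value in values.items()}

angles = sectoral_angles(allocation)
for item, angle in angles.items():
    print(f"{item}: {angle:.0f} degrees")
print("Sum of angles:", sum(angles.values()))
```

Checking that the angles add up to 360 is a quick way of catching an arithmetic slip before drawing the chart.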

Figure: Pie Chart showing Items

It is possible that the values of the items are given as percentages of the total value of the items. To find the
sectoral angles, we only need to multiply each percentage, expressed as a fraction of 100, by 360°.

Example

Suppose the basic elements of cost of a restaurant are expressed as percentages of sales as follows.

Elements         Cost as % of Sales
Food Cost        40
Labour Cost      35
Overhead Cost    15
Net Profit       10
Sales            100

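Obtain the sectoral angles for the elements of cost.

Sectoral Angles

$$\text{Food Cost} = \frac{40}{100} \times 360^\circ = 144^\circ$$

$$\text{Labour Cost} = \frac{35}{100} \times 360^\circ = 126^\circ$$

$$\text{Overhead Cost} = \frac{15}{100} \times 360^\circ = 54^\circ$$

$$\text{Net Profit} = \frac{10}{100} \times 360^\circ = 36^\circ$$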

SUMMARISING DATA

1. Ordered Array

When data are collected, the values usually come in no particular order. This unordered array of values does not
facilitate the process of analyzing the data. The data must therefore be arranged either in ascending or descending
order of magnitude so as to facilitate the analysis.

Example: Suppose the ages of twenty pupils in a primary school are as follows (ages to the nearest year).

6, 8, 10, 11, 7, 5, 8, 9, 11, 12, 11, 9, 8, 12, 5, 6, 10, 8, 7, 9.

A look through the values shows no order of arrangement. An ordered form of the values shows the following:

5, 5, 6, 6, 7, 7, 8, 8, 8, 8, 9, 9, 9, 10, 10, 11, 11, 11, 12, 12.
The data arranged in this form has a number of advantages over the raw data.
(i) We can quickly know the lowest and highest values in the data.
(ii) We can easily divide the data into sections.
(iii) We can see whether any value appears more than once in the array.
(iv) We can observe the distance between succeeding values in the data.
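
The ordering and the four advantages listed above can be illustrated with a minimal Python sketch. The variable names are illustrative assumptions, and the built-in `sorted` function stands in for arranging the values by hand.

```python
# Ages of the twenty pupils, in the order they were collected.
ages = [6, 8, 10, 11, 7, 5, 8, 9, 11, 12, 11, 9, 8, 12, 5, 6, 10, 8, 7, 9]

ordered = sorted(ages)                      # ascending order of magnitude
lowest, highest = ordered[0], ordered[-1]   # (i) lowest and highest values
lower_half = ordered[:len(ordered) // 2]    # (ii) dividing the data into sections
repeated = sorted({v for v in ordered if ordered.count(v) > 1})  # (iii) repeated values
gaps = [b - a for a, b in zip(ordered, ordered[1:])]  # (iv) distance between succeeding values

print(ordered)
print("lowest:", lowest, "highest:", highest)
```

Passing `reverse=True` to `sorted` gives the descending arrangement instead.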

Exercise

Go through the values listed below and arrange them in ascending order of magnitude.

The daily births in a maternity home in a month are given below:

5, 3, 6, 7, 4, 1, 2, 4, 0, 3
3, 5, 2, 6, 2, 5, 0, 7, 2, 5
2, 3, 5, 4, 6, 7, 5, 1, 1, 3

2. Frequency Distribution
2.1 Frequency Distribution for Ungrouped Data

An ordered array of data does not sufficiently summarize the data. The data can be further summarized by
preparing the frequency distribution for the data.

A frequency distribution is a table showing the values of the data and the number of occurrences of each value.

Example
For the data in the exercise above, present the frequency distribution for the values. The values will be represented
by Xi and the frequencies by Fi.


Frequency Distribution

Xi    Fi
0     2
1     3
2     5
3     5
4     3
5     6
6     3
7     3

This table shows the values of the variable and their respective frequencies.
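
Counting the occurrences by hand is error-prone for longer lists. The following is a minimal Python sketch of the same tabulation, using `collections.Counter` from the standard library; the variable names are illustrative assumptions.

```python
from collections import Counter

# Daily births in the maternity home, copied from the exercise above.
births = [
    5, 3, 6, 7, 4, 1, 2, 4, 0, 3,
    3, 5, 2, 6, 2, 5, 0, 7, 2, 5,
    2, 3, 5, 4, 6, 7, 5, 1, 1, 3,
]

frequency = Counter(births)   # maps each value Xi to its frequency Fi

print("Xi  Fi")
for value in sorted(frequency):
    print(f"{value:<3} {frequency[value]}")
print("Total:", sum(frequency.values()))
```

The total of the frequencies should equal the number of observations, which is 30 here.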

2.2 Frequency Distribution of Grouped Data

It is possible to have the frequency distribution in the example above without grouping the data because the values
are not many. There are only eight distinct values. In situations where the values are in thousands or millions, it
may be difficult to analyze the values if they are not grouped.
For many sophisticated analyses, we need to group the data before analysis commences. We therefore group the data
into class intervals.

Class intervals are defined as contiguous, non-overlapping intervals selected in such a way that they are mutually
exclusive and collectively exhaustive. They are mutually exclusive in the sense that a value is placed in one and only
one class interval, and collectively exhaustive in the sense that every value falls into some class interval.
The class intervals could be 5-9, 10-14, 15-19, 20-24, ... They could also be 11-20, 21-30, 31-40, ...
Either form may be used. In this unit there will be more examples of class intervals. The class intervals should be
neither too few nor too many. Too few class intervals can result in a loss of much detail, while too many class
intervals may not condense the data.


Example

The table below shows the scores of 50 students in mathematics in a Senior Secondary Examination

19 50 57 25 61 42 26 33 46 45
63 31 80 36 78 56 38 69 83 40
52 17 35 65 13 63 72 29 56 57
22 45 53 44 76 47 86 55 66 48
41 64 38 43 23 58 55 32 52 46

For this, we first prepare the tallies; from the tallies we obtain the frequency of each of the class intervals.

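Class Intervals    Tallies           Frequency

11-20              III               3
21-30              IIII/             5
31-40              IIII/ III         8
41-50              IIII/ IIII/ I     11
51-60              IIII/ IIII/       10
61-70              IIII/ II          7
71-80              IIII              4
81-90              II                2

(IIII/ stands for a bundle of five tallies, the fifth stroke being drawn across the first four.)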
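
The tallying can be cross-checked with a short program. This is a minimal Python sketch that counts how many scores fall within each pair of class limits; the function name `grouped_frequencies` and its parameters are illustrative assumptions.

```python
# Scores of the 50 students, copied from the table above.
scores = [
    19, 50, 57, 25, 61, 42, 26, 33, 46, 45,
    63, 31, 80, 36, 78, 56, 38, 69, 83, 40,
    52, 17, 35, 65, 13, 63, 72, 29, 56, 57,
    22, 45, 53, 44, 76, 47, 86, 55, 66, 48,
    41, 64, 38, 43, 23, 58, 55, 32, 52, 46,
]

def grouped_frequencies(values, first_lower, width, n_classes):
    """Count the values falling inside each class interval (limits inclusive)."""
    table = []
    for k in range(n_classes):
        lower = first_lower + k * width
        upper = lower + width - 1
        count = sum(1 for v in values if lower <= v <= upper)
        table.append((lower, upper, count))
    return table

table = grouped_frequencies(scores, first_lower=11, width=10, n_classes=8)
for lower, upper, count in table:
    print(f"{lower}-{upper}: {count}")
```

Because the intervals are mutually exclusive and collectively exhaustive, the frequencies add up to the 50 scores.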


What you have above is the frequency distribution of grouped data. To further summarize grouped data, some basic
concepts are important. These concepts will be defined and computed now.

(i) Class Limits: For any class interval, there are two class limits, the lower and the upper class limits. For
example 3.2, the lower class limits are 11, 21, 31, 41, 51, 61, 71 and 81. The upper class limits are 20, 30, 40, 50,
60, 70, 80 and 90.

(ii) Class Boundaries: For any class interval, there are two class boundaries. The lower class boundary of
a class interval is the mean of the lower class limit of the interval and the upper class limit of the preceding
interval. Let us compute the lower class boundary of the interval 11-20. The lower class limit of the class interval is
11 and the upper class limit of the preceding class interval is taken to be 10. The lower class boundary of the class
interval is therefore:

$$\frac{10 + 11}{2} = 10.5$$

For example 3.2, the lower class boundaries of the class intervals are 10.5, 20.5, 30.5, 40.5, 50.5, 60.5, 70.5 and
80.5.

The upper class boundary of a class interval is the mean of the upper class limit of the class interval and the lower
class limit of the succeeding class interval. For example 3.2, the upper class boundary of the interval 11-20 is 20.5.
The upper class limit of the class interval is 20 while the lower class limit of the succeeding class interval is 21.
The upper class boundary of the interval is therefore equal to:

$$\frac{20 + 21}{2} = 20.5$$

For the example, the upper class boundaries of the class intervals are respectively 20.5, 30.5, 40.5, 50.5, 60.5, 70.5,
80.5 and 90.5. From these values, you will see that the upper class boundary of a class interval is the lower class
boundary of the succeeding class interval. The classes can also be given in terms of class boundaries rather than class
limits. When this is done, the end points of adjacent class intervals overlap. For example 3.2, if the class boundaries
are used, we will have the following as our frequency distribution.

Class Boundaries    Frequency

10.5-20.5           3
20.5-30.5           5
30.5-40.5           8
40.5-50.5           11
50.5-60.5           10
60.5-70.5           7
70.5-80.5           4
80.5-90.5           2

(iii) Class Width: This is the difference between the upper and lower class boundaries (not class limits). For our
example, the class width for all the class intervals is 10. For the first interval, the class width is
20.5 - 10.5 = 10. It is not 20 - 11 = 9.
(iv) Class Mark: This is the mean of the upper and the lower class boundaries. It can also be obtained as the mean of
the lower and the upper class limits. For example 3.2, the class mark for the first interval is:

$$\frac{10.5 + 20.5}{2} = 15.5 \quad \text{or equivalently} \quad \frac{11 + 20}{2} = 15.5$$

For the classes in the example, we have the following class marks for the respective class intervals: 15.5, 25.5,
35.5, 45.5, 55.5, 65.5, 75.5 and 85.5. We now summarize what we have done so far into class limits, class
boundaries, class marks and class widths.

Class Limits    Class Boundaries    Class Marks    Class Width

11-20           10.5-20.5           15.5           10
21-30           20.5-30.5           25.5           10
31-40           30.5-40.5           35.5           10
41-50           40.5-50.5           45.5           10
51-60           50.5-60.5           55.5           10
61-70           60.5-70.5           65.5           10
71-80           70.5-80.5           75.5           10
81-90           80.5-90.5           85.5           10
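
The summary table can be derived from the class limits alone. This is a minimal Python sketch; the function name `describe_class` is illustrative, and the parameter `gap` (assumed to be 1 because the scores are whole numbers) is the distance between an upper class limit and the next lower class limit.

```python
# Class limits of the grouped score distribution.
class_limits = [(11, 20), (21, 30), (31, 40), (41, 50),
                (51, 60), (61, 70), (71, 80), (81, 90)]

def describe_class(lower_limit, upper_limit, gap=1):
    """Return the boundaries, class mark and class width of one class interval."""
    lower_boundary = lower_limit - gap / 2   # midway to the preceding upper limit
    upper_boundary = upper_limit + gap / 2   # midway to the succeeding lower limit
    return {
        "boundaries": (lower_boundary, upper_boundary),
        "mark": (lower_limit + upper_limit) / 2,
        "width": upper_boundary - lower_boundary,
    }

for lower, upper in class_limits:
    info = describe_class(lower, upper)
    print(f"{lower}-{upper}", info["boundaries"], info["mark"], info["width"])
```

For data recorded to one decimal place, such as classes 8.0-8.9 and 9.0-9.9, the same function would be called with `gap=0.1`.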

3.0 Relative Frequency

3.1 Ungrouped Data

The relative frequency of a value is defined as the frequency of that value divided by the total frequency of all the
values contained in the set of values.

Example. Suppose the frequency distribution of the scores of twenty students in a test is as presented below.

Scores    Frequency
2         1
3         2
4         2
5         4
6         5
7         3
8         2
9         1
TOTAL     20

The total frequency is 20. The relative frequency of the first score is given as 1/20 = 0.05

The relative frequency distribution of the scores is therefore given as:

Scores    Frequency    Relative Frequency

2         1            0.05
3         2            0.10
4         2            0.10
5         4            0.20
6         5            0.25
7         3            0.15
8         2            0.10
9         1            0.05
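
The relative frequencies can be computed in one pass once the total frequency is known. This is a minimal Python sketch; the variable names are illustrative assumptions.

```python
# Score -> frequency, copied from the frequency distribution of the twenty students.
frequencies = {2: 1, 3: 2, 4: 2, 5: 4, 6: 5, 7: 3, 8: 2, 9: 1}

total = sum(frequencies.values())   # total frequency of all the values
relative = {score: f / total for score, f in frequencies.items()}

for score, rel in relative.items():
    print(f"{score}: {rel:.2f}")
print("Sum of relative frequencies:", round(sum(relative.values()), 2))
```

The same two lines work for grouped data if the dictionary keys are class intervals instead of single scores.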

3.2 Grouped Data

For grouped data, the relative frequency of a class interval is the frequency of the interval divided by the total
frequency of all the class intervals. For example, the relative frequency distribution of the class intervals in
example 3.2 is as presented below.

Relative Frequency Distribution of Grouped Data

Class Intervals    Frequency    Relative Frequency

11-20              3            0.06
21-30              5            0.10
31-40              8            0.16
41-50              11           0.22
51-60              10           0.20
61-70              7            0.14
71-80              4            0.08
81-90              2            0.04

Going through the relative frequency tables presented in this unit, you will realize that the sum of the relative
frequencies for a set of values is 1.

3.3 Cumulative Relative Frequency Distribution

3.3.1 Ungrouped Data

In this unit, you have learned the construction of relative frequency for ungrouped data. A further summary of data can
be in the form of cumulative relative frequency. To construct the cumulative relative frequency, we first need to
construct the cumulative frequency for the data. The cumulative frequency of a value is the sum of the frequencies of
all the values up to and including that value. The cumulative relative frequency of a value is its cumulative frequency
divided by the total frequency of the values contained in the array of data.

For example 3.3, the cumulative frequency and the cumulative relative frequency for the set of data are as
presented below.

Score    Frequency    Cumulative Frequency    Cumulative Relative Frequency
2        1            1                       1/20 = 0.05
3        2            3 = 1+2                 3/20 = 0.15
4        2            5 = 3+2                 5/20 = 0.25
5        4            9 = 5+4                 9/20 = 0.45
6        5            14 = 9+5                14/20 = 0.70
7        3            17 = 14+3               17/20 = 0.85
8        2            19 = 17+2               19/20 = 0.95
9        1            20 = 19+1               20/20 = 1.00
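
The running totals in the table can be produced with `itertools.accumulate` from the Python standard library. This is a minimal sketch; the variable names are illustrative assumptions.

```python
from itertools import accumulate

scores = [2, 3, 4, 5, 6, 7, 8, 9]
frequencies = [1, 2, 2, 4, 5, 3, 2, 1]

cumulative = list(accumulate(frequencies))    # running total of the frequencies
total = cumulative[-1]                        # last running total = total frequency
cumulative_relative = [c / total for c in cumulative]

for score, f, c, cr in zip(scores, frequencies, cumulative, cumulative_relative):
    print(f"{score}  {f}  {c}  {cr:.2f}")
```

The last cumulative frequency always equals the total frequency, so the last cumulative relative frequency is 1.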

3.3.2 Grouped Data


The cumulative frequency of grouped data is constructed the same way as that of ungrouped data.

Example

The table below shows the frequency distribution of the ages of employees in a firm. Prepare the cumulative
frequency and the cumulative relative frequency for the ages of the employees.

Age (In Years)    Frequency

18-20             2
21-23             5
24-26             13
27-29             25
30-32             22
33-35             8
36-38             6
39-41             3
42-44             5
45-47             3
48-50             3
51-53             2
54-56             2
57-59             1

The cumulative frequency and the cumulative relative frequency distributions are as follows.

Ages (Years)    Frequency    Cumulative Frequency    Cumulative Relative Frequency

18-20           2            2                       2/100 = 0.02
21-23           5            2+5 = 7                 7/100 = 0.07
24-26           13           7+13 = 20               20/100 = 0.20
27-29           25           20+25 = 45              45/100 = 0.45
30-32           22           45+22 = 67              67/100 = 0.67
33-35           8            67+8 = 75               75/100 = 0.75
36-38           6            75+6 = 81               81/100 = 0.81
39-41           3            81+3 = 84               84/100 = 0.84
42-44           5            84+5 = 89               89/100 = 0.89
45-47           3            89+3 = 92               92/100 = 0.92
48-50           3            92+3 = 95               95/100 = 0.95
51-53           2            95+2 = 97               97/100 = 0.97
54-56           2            97+2 = 99               99/100 = 0.99
57-59           1            99+1 = 100              100/100 = 1.00


Assignment

For the table presented below, prepare

(i) The frequency distribution for the class intervals 11-20, 21-30, 31-40, ...
(ii) Cumulative Frequency Distribution.
(iii) Relative Frequency Distribution.
(iv) Cumulative Relative Frequency Distribution.

Ages of 100 Employees

58 37 21 28 27 24 27 39 38 30
60 20 30 50 44 33 23 26 31 32
18 23 41 32 27 42 40 29 28 34
56 19 28 29 23 47 29 24 31 34
30 19 22 53 49 26 16 38 42 36
41 32 33 20 31 36 32 32 31 32
21 24 55 21 24 34 37 29 33 32
32 49 38 48 33 43 26 38 30 28
24 46 15 43 43 23 23 28 34 28
41 23 25 19 51 23 36 31 35 31

MEASURE OF CENTRAL TENDENCY

The Arithmetic Mean

The Arithmetic Mean of Ungrouped Data


We often use the word average to mean the arithmetic mean. We compute the arithmetic mean for
both ungrouped and grouped data. We also compute the arithmetic mean, which we henceforth in
this unit call the mean, for both the sample and the population from which the sample is drawn.
The mean computed for the sample is called a statistic and it is denoted by $\bar{x}$. The mean computed
for the population is called a parameter and is denoted by µ. You should note in this course that
any measure computed for the sample is called a statistic and any measure computed for the
population is called a parameter.

The mean of ungrouped data is the summation of the values in the set of data divided by the
number of values in the set of data.

For a sample, the number of values is denoted by n, that is the sample size, and for the
population, the population size is given as N.

The mean of ungrouped data (if a sample) is

$$\bar{x} = \frac{\sum x_i}{n}$$

where

∑ = summation

$x_i$ = values of the variable

n = sample size

Example 5.1
Compute the mean for the following:

Scores: 7, 5, 8, 10, 11, 6, 3, 4, 10, 9, 13, 2.

$$\bar{x} = \frac{7+5+8+10+11+6+3+4+10+9+13+2}{12} = \frac{88}{12} = 7.33$$
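
The calculation in Example 5.1 is a sum divided by a count. This is a minimal Python sketch; the variable names are illustrative, and `statistics.mean` from the standard library is shown only as a cross-check.

```python
import statistics

scores = [7, 5, 8, 10, 11, 6, 3, 4, 10, 9, 13, 2]

n = len(scores)         # sample size
total = sum(scores)     # summation of the values
mean = total / n

print("sum =", total, "n =", n, "mean =", round(mean, 2))
print("cross-check:", round(statistics.mean(scores), 2))
```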

You will realize that there is no frequency distribution for this example. Suppose
there is a frequency distribution for the values of a variable, then how do we calculate
the mean?

This is simple. If we are computing the mean for a sample, that is $\bar{x}$, the mean is

$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{\sum f_i x_i}{n}$$

where

$f_i$ = frequencies of the different values in the set of data

$x_i$ = the different values in the set of data

$\sum f_i$, or n, is equal to the number of values in the set of data.

Example 5.2

The table below gives the frequency distribution of the marks scored by 20 students in
a test conducted in Statistics for Management.

Scores    Frequency

11        2
12        3
13        5
14        7
15        2
16        1

You are required to calculate the mean of the scores.

To solve the problem, you should

(i) Multiply each score by its respective frequency to obtain $f_i x_i$.
(ii) Add up the products of each score and its respective frequency to obtain $\sum f_i x_i$.
(iii) Add up all the frequencies to obtain $\sum f_i$ or n.
(iv) Divide $\sum f_i x_i$ by $\sum f_i$ to obtain the mean.

Scores (xi)    Frequency (fi)    fixi

11             2                 22
12             3                 36
13             5                 65
14             7                 98
15             2                 30
16             1                 16

Total          20                267

$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{267}{20} = 13.35$$
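
Steps (i) to (iv) translate directly into code. This is a minimal Python sketch, with the frequency table held in a dictionary; the variable names are illustrative assumptions.

```python
# Score (xi) -> frequency (fi), copied from Example 5.2.
table = {11: 2, 12: 3, 13: 5, 14: 7, 15: 2, 16: 1}

products = {x: f * x for x, f in table.items()}   # (i)   fi * xi for each score
sum_fx = sum(products.values())                   # (ii)  sum of fi * xi
n = sum(table.values())                           # (iii) sum of fi
mean = sum_fx / n                                 # (iv)  the mean

print("sum of fi*xi =", sum_fx, "n =", n, "mean =", mean)
```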

There is another method of computing the mean apart from using $\frac{\sum f_i x_i}{\sum f_i}$. If
we use an assumed mean, A, then the mean is

$$\bar{x} = A + \frac{\sum f_i d_i}{\sum f_i}$$

where

A = assumed mean

$d_i = x_i - A$ for all the values of $x_i$

Example: Let us compute the mean again using an assumed mean of 13.

We will then have the table below for the solution of the problem.

Scores (Xi)    fi    di = Xi - A     fidi

11             2     11-13 = -2      -4
12             3     12-13 = -1      -3
13             5     13-13 = 0       0
14             7     14-13 = 1       7
15             2     15-13 = 2       4
16             1     16-13 = 3       3

Total          20                    7

Assumed mean (A) = 13

$$\bar{x} = 13 + \frac{7}{20} = 13 + 0.35 = 13.35$$

You will see that with the method of assumed mean, we obtain the same value of the mean
as before.
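
The assumed mean method can be sketched in Python as follows. The function name `assumed_mean_method` is an illustrative assumption. The choice of A does not affect the result, which the last line demonstrates with a second, arbitrary value of A.

```python
# Score (Xi) -> frequency (fi), the same distribution as in the example.
table = {11: 2, 12: 3, 13: 5, 14: 7, 15: 2, 16: 1}

def assumed_mean_method(freq_table, assumed):
    """Mean = A + (sum of fi*di) / (sum of fi), where di = Xi - A."""
    n = sum(freq_table.values())
    sum_fd = sum(f * (x - assumed) for x, f in freq_table.items())
    return assumed + sum_fd / n

print(assumed_mean_method(table, 13))   # A = 13, as in the example
print(assumed_mean_method(table, 15))   # a different A gives the same mean
```

Choosing A close to the centre of the data keeps the deviations small, which is what makes the method convenient for hand calculation.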

3.1.2 The Arithmetic Mean of Grouped Data

You have learned how to compute the arithmetic mean of ungrouped data.
When the values are many in a set of data, there is the need to group them into class
intervals. You learned about this in unit 3. We need to take some time to compute
the mean of grouped data.


Example
The earnings per share (in kobo) of some firms are presented below with the frequency
distribution.

Earnings per share (in kobo)    Frequency

65-69                           3
70-74                           4
75-79                           11
80-84                           15
85-89                           9
90-94                           5
95-99                           3

You are required to calculate the mean of the distribution.

To solve this question, we need to compute the class mark for each of the class intervals. The
class mark becomes the Xi we will use in the computation. Once this is done,
the whole distribution is reduced to the form of ungrouped data with a frequency
distribution. You should recall that the class mark is the mean of the upper and lower
class boundaries (or limits) of a class interval.

Class Mark (Xi)    Frequency (fi)    fiXi

67                 3                 201
72                 4                 288
77                 11                847
82                 15                1230
87                 9                 783
92                 5                 460
97                 3                 291
TOTAL              50                4100

$$\bar{x} = \frac{4100}{50} = 82$$
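
The reduction of grouped data to class marks can be sketched in Python as follows. The variable names are illustrative assumptions; each class is stored as its lower limit, upper limit and frequency.

```python
# (lower limit, upper limit, frequency) for earnings per share in kobo.
classes = [
    (65, 69, 3), (70, 74, 4), (75, 79, 11), (80, 84, 15),
    (85, 89, 9), (90, 94, 5), (95, 99, 3),
]

# The class mark is the mean of the lower and upper class limits.
marks = [(lower + upper) / 2 for lower, upper, _ in classes]
frequencies = [f for _, _, f in classes]

sum_fx = sum(f * x for f, x in zip(frequencies, marks))
n = sum(frequencies)
mean = sum_fx / n

print("class marks:", marks)
print("sum of fi*Xi =", sum_fx, "n =", n, "mean =", mean)
```

The result is an approximation to the mean of the original values, since every value in a class is treated as if it were equal to the class mark.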

Example: Using the assumed mean method, compute the mean of the distribution in the
example above.

We will still make use of the frequency distribution of the class marks, with an assumed mean of A = 77.

Class Mark (xi)    Frequency (fi)    di = xi - 77    fidi

67                 3                 -10             -30
72                 4                 -5              -20
77                 11                0               0
82                 15                5               75
87                 9                 10              90
92                 5                 15              75
97                 3                 20              60
TOTAL              50                                250

$$\bar{x} = A + \frac{\sum f_i d_i}{\sum f_i} = 77 + \frac{250}{50} = 77 + 5 = 82$$

Advantages And Disadvantages Of Arithmetic Mean

Advantages of Arithmetic Mean.

The arithmetic mean has the following advantages

(i) The mean is the best known of all the averages.

(ii) The mean can be used in further mathematical processing. It is used in
statistical procedures such as estimation and hypothesis testing.
(iii) The mean is unique, unlike the mode (this will be discussed later), because a set of
data has one and only one mean.


3.3.2 The disadvantages of mean

Arithmetic mean has the following disadvantages.

(i) Since all the values in a set of data are used to compute the mean, the mean can be
influenced by extreme values.

For instance, the mean of 3, 4, 5, 6 and 7 is

$$\frac{3+4+5+6+7}{5} = \frac{25}{5} = 5$$

There is no extreme value here. Suppose we have 3, 4, 5, 6, 7 and 19; we then have an
extreme value, 19. The mean becomes

$$\frac{3+4+5+6+7+19}{6} = \frac{44}{6} = 7.33$$

The extreme value has greatly influenced the mean.

(ii) A mean may result in an impossible value where the data is discrete.
For instance, if the numbers of female students in five programmes at the
National Open University are 35, 38, 42, 53 and 66, the mean number of
female students will be

$$\frac{35+38+42+53+66}{5} = \frac{234}{5} = 46.8 \text{ students}$$

This is an impossible value.

(iii) We are unable to compute the mean for data in which there are open-ended
classes either at the beginning of the distribution or at the end of the
distribution. It will be difficult to know the class mark of the open-ended
class.

Assignment

From the frequency distribution table shown below

(i) Compute the mean using the two methods in this unit
(ii) Compare your result and comment

Class Frequency
8.0-8.9 5
9.0-9.9 7
10.0-10.9 10
11.0-11.9 13
12.0-12.9 15
13.0-13.9 5
14.0-14.9 3
15.0-15.9 2

GEOMETRIC MEAN

Computation Of Geometric Mean

For values $x_1, x_2, x_3, \ldots, x_n$, the geometric mean is the nth root of the product of the values.
The geometric mean is denoted as GM, therefore

$$GM = \sqrt[n]{x_1 \times x_2 \times x_3 \times \cdots \times x_n}$$

where GM is the geometric mean, $x_1, x_2, x_3, \ldots, x_n$ are values of the variable of interest, and n
represents the sample size.

To compute the geometric mean:

(i) Multiply the values all together to get the product.
(ii) Take the nth root of the product.

Example

What is the geometric mean of 3, 5, 6, and 7?

$$GM = \sqrt[4]{3 \times 5 \times 6 \times 7} = \sqrt[4]{630} = 5.01$$
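
The two steps can be sketched in Python as follows, assuming Python 3.8 or later for `math.prod`. The function name `geometric_mean` is illustrative; the standard library also offers `statistics.geometric_mean`, used here only as a cross-check.

```python
import math
import statistics

def geometric_mean(values):
    """nth root of the product of n positive values."""
    product = math.prod(values)           # (i) multiply the values together
    return product ** (1 / len(values))   # (ii) take the nth root

values = [3, 5, 6, 7]
gm = geometric_mean(values)
print("product =", math.prod(values), "GM =", round(gm, 2))
print("cross-check:", round(statistics.geometric_mean(values), 2))
```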

HARMONIC MEAN

Computation Of Harmonic Mean.

For values $x_1, x_2, x_3, \ldots, x_n$, the harmonic mean is given as

$$HM = \frac{n}{\sum \frac{1}{x_i}}$$

For this we do the following:

(1) We add up the reciprocals of the values.
(2) We divide the number of items by this sum.

Example 6.5

Compute the harmonic mean of 5, 6 and 7

$$\text{Harmonic mean} = \frac{3}{\frac{1}{5} + \frac{1}{6} + \frac{1}{7}} = \frac{3}{0.2000 + 0.1667 + 0.1429} = \frac{3}{0.5096} = 5.89$$
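
The same two steps can be sketched in Python as follows. The function name `harmonic_mean` is illustrative; the standard library's `statistics.harmonic_mean` is used only as a cross-check.

```python
import statistics

def harmonic_mean(values):
    """Number of items divided by the sum of the reciprocals of the values."""
    reciprocal_sum = sum(1 / v for v in values)   # (1) add up the reciprocals
    return len(values) / reciprocal_sum           # (2) divide n by that sum

values = [5, 6, 7]
hm = harmonic_mean(values)
print("HM =", round(hm, 4))
print("cross-check:", round(statistics.harmonic_mean(values), 4))
```

Carrying the reciprocals at full precision, rather than rounding each to three decimal places first, avoids small differences in the second decimal place of the answer.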

The harmonic mean can also be used to obtain the average of different speeds

Uses Of Harmonic Mean

The harmonic mean is used to average ratios, speeds etc. It is used mostly in engineering.

Exercise 6.2

Compute the harmonic mean of the following values


5, 7, 9, 11, 13, and 15
