Identifiers and Data Objects

This chapter deals with the basic elements used to create a C program: the valid character set, identifiers, keywords, basic data types and their representation, constants, and variables.


C uses the uppercase letters A to Z, the lowercase letters a to z, the digits 0 to 9, and certain special characters as building blocks to form basic program elements (e.g., constants, variables, operators, expressions). The special characters are listed below.

!   *   +   \   "   <
#   (   =   |   {   >
%   )   ~   ;   }   /
^   -   [   :   ,   ?
&   _   ]   '   .   (blank)

Table 2.1 : C character set

Most versions of the language also allow some other characters, such as @ and $, to be included within strings and comments.

In addition, certain combinations of these characters, such as '\b', '\n' and '\t', are used to represent special characters such as backspace, newline and horizontal tab, respectively. These character combinations are known as escape sequences.


Identifiers are names given to various items in the program, such as variables, functions and arrays. An identifier consists of letters and digits, in any order, except that the first character must be a letter. Both uppercase and lowercase letters are permitted, but they are not interchangeable (an uppercase letter is not equivalent to the corresponding lowercase letter), so count and Count are two different identifiers. The underscore character (_) can also be included, and it is treated as a letter.

Keywords like if, else, int, float, etc., are reserved and they cannot be used as identifier names.

Example 2.1 :

The following names are valid identifiers.

x y12 speed employee_name

area Perimeter TABLE

The following names are not valid identifiers for the reasons stated.

5th             the first character must be a letter
"Calcutta"      illegal characters (")
project-code    illegal character (-)
error symbol    illegal character (blank space)

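Although an identifier can be arbitrarily long, an implementation need not treat all of its characters as significant. The ANSI standard guarantees that at least the first 31 characters of an internal identifier are significant, and most implementations recognize at least that many. Some older implementations recognize only the first eight characters.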

1. Identify which of the following are valid identifiers.

(a) data 1
(b) 1data
(c) data_1
(d) break
(e) %tax
(f) _name
(g) Your name and current address
(h) Your_name_and_current_address
(i) Your-name-and-current-address
(j) 987-65-4321


There are only a few basic data types in C. These are listed in Table 2.2.

Basic data types :

Data type   Description                                      Size in bytes   Range (typical)
char        single character                                 1 byte
int         integer number                                   4 bytes         -2147483648 to 2147483647
float       single precision floating point number           4 bytes         approx. 3.4E-38 to 3.4E+38
            (number containing a fraction or an exponent)
double      double precision floating point number           8 bytes         approx. 1.7E-308 to 1.7E+308

Table 2.2 : Details of fundamental data types

The list of data types can be extended by using the data type qualifiers short, long, signed and unsigned. For example, an integer quantity can be defined as a long, short, signed or unsigned integer. The memory requirement of an integer varies with the compiler used; typical qualified data types and their sizes are shown in Table 2.3. Note that the qualifier unsigned can be combined with the qualifiers short and long. An unsigned int occupies the same amount of memory as an ordinary int but differs in how the leftmost bit is interpreted. In an ordinary int that bit is reserved for the sign (the sign bit), whereas all the bits of an unsigned int are used to represent the magnitude of the integer quantity.

The char type is used to represent individual characters, and occupies one byte of memory. Each character is stored internally as an integer code (the ASCII value of the character on most systems), so char variables and constants can be used as integer data in arithmetic expressions.

Data type               Size in bytes   Range (typical)
short int               2 bytes         -32768 to 32767
long int                4 bytes         -2147483648 to 2147483647
unsigned short int      2 bytes         0 to 65535
unsigned int            4 bytes         0 to 4294967295
unsigned long int       4 bytes         0 to 4294967295
long double             8 bytes         approx. 1.7E-308 to 1.7E+308
(extended precision)

Table 2.3 : Details of qualified data types

The data objects to be manipulated in a C program are classified as variables and constants. The type of all the variables to be used in the program must be declared before they can be used. The operations that can be performed on the data objects are specified by a set of operators. Expressions used in a program combine the variables, constants and operators to produce new values.


The constants in C can be classified into four categories, namely integer constants, floating point constants, character constants and string constants.

A character constant is written as 'A' (always enclosed in single quotes).

Examples of string constants are "Calcutta" and "A". Note that a string constant is always enclosed within double quotes.

A normal integer constant is written as 1234.

A long int constant is recognized by the suffix L (uppercase or lowercase) at the end of the constant, e.g. 2748723L. The uppercase form is preferable because a lowercase l is easily mistaken for the digit 1.

The suffix u or U marks the constant as unsigned.

The suffix UL or ul indicates that the constant is of unsigned long type.

Floating point constants contain a decimal point (167.903) or an exponent (1e-2) or both. Their type is double unless suffixed. The suffix f or F indicates float; l or L indicates long double.

C also supports octal and hexadecimal notation for integer constants. A hexadecimal constant must begin with 0x or 0X; a leading 0 indicates octal. Octal and hexadecimal constants may also be followed by U to indicate unsigned or L to indicate long.

The number 0x2A5 is an example of a hexadecimal number. Internally the number is represented by the following bit pattern, in which each hexadecimal digit occupies four bits:

0x2A5 = 0010 1010 0101 = 2 * 16² + 10 * 16¹ + 5 * 16⁰ = 677
         2    A    5

The number 677 is the decimal equivalent of the number 0x2A5.

An example of an octal number is 0347. Each octal digit needs at most three bits in binary, since the largest digit in the octal number system is 7.

0347 = 011 100 111 = 3 * 8² + 4 * 8¹ + 7 * 8⁰ = 231 (in decimal)
        3   4   7

Blanks and other non-numeric characters cannot be included in numeric constants (integer or floating point constants), apart from the prefixes, suffixes, decimal point and exponent described above. The range of these constants is limited by maximum and minimum bounds, which are usually machine dependent.

A character constant is a single character enclosed in apostrophes, such as 'A'. Such a constant is internally treated as an integer; e.g., 'A' corresponds to the integer 65 (its ASCII value), and the ASCII value of 'a' (lowercase) is 97. Hence character constants can take part in numeric calculations just like any other integer, and they can also be compared with other character constants. Some character constants are non-printing and are expressed by a backslash (\) followed by one or more characters. They are known as escape sequences. Each escape sequence represents a single character even though it is written with two or more characters.

Commonly used escape sequences are listed below :

Escape Sequence    Character
\r                 carriage return
\v                 vertical tab
\t                 horizontal tab

Table 2.4 : C escape sequences.

Symbolic Constants

Constants that play an important role in a program can be given meaningful names, which makes the program more readable and easier to change. These constants are called symbolic constants and are defined as follows.

Example 2.2 :

#define PI 3.141593
#define TRUE 1
#define PROMPT "Enter Your Name :"

PI, TRUE and PROMPT are symbolic constants. The preprocessor replaces each name with its text before compilation, so they are not variables and do not appear in any declaration. #define is a preprocessor directive, like #include in Example 1.1.

5. Write an appropriate definition for each of the following symbolic constants, as it would appear within a C program.

    Constant     Text
(a) AMOUNT       -75
(b) EPSILON      0.0001
(c) BEGIN        {
(d) FIRSTNAME    "Malabika"
(e) ENDOLIN      '\n'
(f) COST         "Rs.109.95"


In a C program all variables must be declared before they are used. A declaration determines the type of the data, and contains a list of one or more variables having the same data type.

Example 2.3 :

int count, index;
char flag, text[80];
short int a, b;
unsigned int p;
double d;

A variable can be initialized with values at the time of declaration.

Example 2.4 :

int c = 5;
char reply = 'Y';
double d = 4.64386237445675;
char state[] = "WEST BENGAL";
float eps = 1.0e-5;