FINTECH QBank MCQ
1) Which of the following is NOT considered a key driver of the fintech revolution?
a) Rising internet penetration
b) Technological advancements like AI and blockchain
c) Increased regulatory burden on traditional banks
d) Growing demand for personalised financial services (Correct Answer)
2) Which of the following technologies is NOT primarily associated with blockchain in the context of
fintech?
a) Distributed ledger
b) Cryptocurrencies
c) Smart contracts
d) Centralised databases (Correct Answer)
6) Payments & Digital Wallets: Compared to traditional payment methods, mobile wallets generally offer:
a) Lower security and higher susceptibility to fraud
b) Faster and more convenient transactions (Correct Answer)
c) Wider acceptance by merchants
d) Higher transaction fees
7) Which of the following is NOT a common security concern associated with contactless payments?
a) Interception of radio frequency signals by criminals
b) Skimming of data from physical payment terminals
c) Phishing attacks targeting mobile wallet users (Correct Answer)
d) Malfunctioning of contactless payment chips
19) Fintech plays a role in disaster relief and development aid by:
a) Facilitating cross-border donations and resource allocation (Correct)
b) Providing financial services to displaced populations
c) Developing early warning systems for natural disasters
d) Building infrastructure for digital transactions in remote areas
21) Blockchain technology can be used for supply chain management by:
a) Tracking the movement of goods and ensuring product authenticity (Correct Answer)
b) Streamlining logistics and optimising inventory levels
c) Providing financing solutions for small and medium-sized businesses
d) Eliminating the need for intermediaries and middlemen
Cyber Security:
22) Which of the following cyberattacks is most likely to target a fintech company?
a) Ransomware attacks on critical infrastructure (Correct Answer)
b) Denial-of-service attacks against online banking platforms
c) Spear phishing emails targeting executives
d) Physical attacks on data centres
Blockchain:
26) Blockchain technology is not suitable for:
a) Tracking the movement of goods in a supply chain
b) Facilitating secure and transparent cross-border payments
c) Creating and managing digital identities
d) Centralised government systems requiring high efficiency and control (Correct)
Risk Management:
33) Operational risk in fintech encompasses:
a) Market fluctuations and changes in interest rates
b) System failures, technical errors, and human mistakes (Correct)
c) Compliance with regulations and legal requirements
d) Fraudulent activities and cyberattacks
34) Credit risk in the context of fintech lending platforms refers to:
a) The potential for borrowers to default on their loans (Correct)
b) The risk of cyberattacks targeting loan data and borrower information
c) The financial stability of the platform itself
d) The compliance of lending practices with ethical standards
36) Regulatory technology (RegTech) can aid risk management in fintech by:
a) Lobbying for relaxed regulations in the financial sector
b) Automating compliance tasks and reporting processes (Correct)
c) Challenging established regulatory frameworks
d) Replacing human expertise and supervision with algorithms
Supply Chain:
37) Blockchain technology can improve transparency and traceability in supply chains by:
a) Creating a decentralised and tamper-proof record of transactions (Correct)
b) Streamlining logistics and optimising inventory management
c) Eliminating the need for physical warehouses and distribution centres
d) Replacing trade agreements and international trade regulations
38) Challenges to adopting blockchain in supply chain management include:
a) Limited scalability and transaction processing speeds of some networks (Correct)
b) Lack of standardisation and interoperability between different blockchain platforms
c) Resistance from established players in the supply chain ecosystem
d) Technical complexity and lack of skilled personnel to manage blockchain implementations
39) Implementing a blockchain-based supply chain solution can benefit companies by:
a) Increasing costs and operational complexity
b) Improving efficiency, reducing fraud, and enhancing brand reputation (Correct)
c) Gaining an unfair advantage over competitors with traditional supply chains
d) Creating unnecessary data silos and information fragmentation
41) Responsible sourcing of materials and ethical labor practices are important considerations when
evaluating a blockchain-based supply chain solution because:
a) They are unrelated to the core functionalities of the technology
b) They can impact stakeholder trust and brand reputation (Correct Answer)
c) They are only relevant for niche industries with ethical concerns
d) Blockchain technology automatically guarantees ethical and sustainable practices
46) Which of the following is NOT a cryptographic technique used in blockchain security?
a) Hashing
b) Digital signatures
c) Encryption
d) Tokenization
Explanation: Tokenization converts assets into digital tokens on a blockchain, but it primarily involves
asset representation and transfer rather than directly contributing to cryptographic security. While
tokens may be encrypted for secure storage, the core security mechanisms rely on hashing, digital
signatures, and consensus algorithms.
Hyperledger: (Blockchain)
51) Hyperledger Fabric is an example of a:
a) Public blockchain
b) Private blockchain
c) Consortium blockchain (Correct)
d) Hybrid blockchain
Explanation: Hyperledger Fabric is designed for permissioned networks with a defined group of
participants, making it a consortium blockchain.
Python:
54) Python is commonly used in fintech for:
a) Data analysis and machine learning
b) Web development and API integration
c) Quantitative trading and financial modelling
d) All of the above (Correct)
Explanation: Python is versatile and widely used in fintech for data analysis, web development,
quantitative modelling, and more.
55) Which Python library is specifically designed for financial data analysis?
a) NumPy
b) Pandas (Correct)
c) Scikit-learn
d) TensorFlow
Explanation: Pandas offers powerful tools and functionalities specifically tailored for financial data
analysis and manipulation.
Blockchain - Hyperledger
61) Which of the following statements is TRUE about Hyperledger Fabric compared to Hyperledger
Sawtooth?
a) Fabric prioritises scalability and permissioned networks, while Sawtooth focuses on privacy and
permissionless consensus. (Correct)
b) Fabric is better suited for enterprise applications requiring confidentiality, while Sawtooth excels in
public blockchain scenarios.
c) Fabric requires more complex development tools compared to the user-friendly nature of
Sawtooth.
Explanation: Fabric prioritizes scalability and permissioned networks, making it ideal for enterprise
applications requiring controlled access and high transaction throughput. Sawtooth focuses on privacy
and uses a permissionless consensus mechanism, making it suitable for scenarios where anonymity and
data ownership are crucial.
62) Which of the following challenges poses the most significant barrier to widespread adoption of
Hyperledger in the financial sector?
a) Lack of standardization and interoperability between different Hyperledger projects. (Correct)
b) Limited developer pool with expertise in Hyperledger platforms and blockchain technology.
c) Regulatory uncertainty surrounding the legal and compliance implications of blockchain in finance.
d) High cost of implementing and maintaining Hyperledger-based solutions compared to traditional IT
infrastructure.
Explanation: The fragmented ecosystem with distinct Hyperledger projects lacking consistent
interoperability hinders widespread adoption. Standardisation efforts are underway, but bridging the
gap remains a critical challenge for large-scale implementation in the financial sector.
67) How can integrating IoT devices with mobile wallets enhance payment security in offline transactions?
a) Enable contactless payments without needing physical cards.
b) Generate unique transaction codes based on device and user interaction. (Correct)
c) Offer instant rewards and loyalty programs based on purchase data.
d) Track user location for targeted advertising and personalised offers.
Explanation: Integrating IoT devices with mobile wallets can generate unique transaction codes based
on device and user interaction, adding an extra layer of security beyond traditional PINs or passwords
for offline payments.
68) A major challenge in utilizing blockchain for real-time micropayments in the Internet of Things (IoT) is:
a) High energy consumption associated with mining.
b) Scalability limitations of certain blockchain networks. (Correct)
c) Lack of standardization and interoperability between different IoT platforms.
d) Security vulnerabilities inherent in wireless communication protocols.
Explanation: The limited scalability of some blockchain networks, struggling to handle a high volume of
small transactions efficiently, becomes the primary challenge for real-time micropayments in the
context of a vast network of IoT devices.
71) What ethical concern arises when considering the use of AI-powered facial recognition technology in
fintech applications for identity verification?
a) Potential for inaccurate or biased results based on training data. (Correct)
b) Increased risk of data breaches and unauthorized access to sensitive information.
c) Lack of user control over their biometric data and privacy concerns.
d) High cost of implementation and maintenance compared to traditional verification methods.
Explanation: The potential for bias and discriminatory outcomes based on training data when utilizing
AI-powered facial recognition raises significant ethical concerns about fairness and inclusivity within
fintech applications.
Payment-related Topics:
72) Which potential benefit of Central Bank Digital Currencies (CBDCs) is most likely to be met with
resistance from established commercial banks?
a) Enhanced financial inclusion and access to banking services.
b) Increased efficiency and speed of cross-border payments.
c) Reduced reliance on private payment intermediaries and decreased revenue streams for banks.
(Correct)
d) Improved transparency and traceability of financial transactions.
Explanation: The potential for CBDCs to reduce reliance on traditional banks for financial services
poses the most significant threat to their current revenue streams and market dominance. While
other benefits exist, this specific aspect is most likely to trigger resistance from established
commercial banks.
73) Open Banking APIs raise concerns about user data privacy. Which of the following is NOT a valid
strategy to mitigate these concerns?
a) Implementing strong data security and access control measures.
b) Providing users with transparent information about how their data is used.
c) Allowing users to granularly control which data is shared with third-party apps.
d) Requiring third-party app developers to adhere to strict data privacy regulations.
Explanation: All of these strategies contribute to mitigating data privacy concerns in Open Banking.
There is no option that directly opposes data protection measures.
74) The rapid adoption of contactless payments in recent years can be attributed to:
a) Increased security due to chip-and-PIN technology.
b) Convenience and speed of transactions compared to traditional methods. (Correct)
c) Lower transaction fees for merchants compared to card payments.
d) Enhanced loyalty rewards and promotional offers via mobile wallets.
Explanation: The convenience and speed of contactless payments, offering a faster checkout
experience and eliminating the need for physical card swipes, have been the main drivers of their
widespread adoption.
76) How can blockchain technology be used to improve healthcare data security and patient privacy in
electronic health records (EHRs)?
a) Enable secure and auditable sharing of medical records between healthcare providers. (Correct)
b) Facilitate direct payments between patients and healthcare providers, bypassing insurance
companies.
c) Streamline administrative tasks and reduce paperwork in healthcare institutions.
d) Allow patients to monetize their medical data by selling it to pharmaceutical companies for
research purposes.
Explanation: Blockchain's ability to create an immutable and transparent record of medical data access
and modifications ensures improved security and patient control over their sensitive information
within EHRs.
Environmental Sustainability:
77) Which fintech solution can incentivize consumers to adopt sustainable practices and reduce their
environmental footprint?
a) Micro-investing platforms focused on renewable energy and green technology companies.
b) Carbon footprint tracking apps that connect user spending habits to environmental impact.
c) Decentralized finance (DeFi) protocols offering tokenized carbon credits for offsetting emissions.
d) Gamified savings apps that reward users for adopting eco-friendly behaviors like cycling or using
public transportation. (Correct)
Explanation: Gamified savings apps that incentivize and reward users for adopting sustainable practices
directly encourage positive behavioral changes, making them a strong solution for environmental
awareness.
78) Which of the following is the correct way to print "Hello, world!" in Python?
a) println("Hello, world!")
b) print("Hello, world!") (Correct)
c) echo("Hello, world!")
d) puts("Hello, world!")
Explanation: The print() function is used for displaying output in Python.
DataFrames:
83) What is the primary data structure used in pandas for data analysis?
a) Series
b) DataFrame (Correct)
c) Array
d) List
Explanation: DataFrames are two-dimensional, labeled data structures with columns that can hold
different data types.
84) How would you select the first column of a DataFrame named df?
a) df.column1
b) df[0] (Correct)
c) df.select("column1")
d) df.head(1)
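Explanation: Square brackets select columns by label, so df[0] returns the first column only when that
column is labelled 0 (for example, in a DataFrame built without column names). To select the first
column by position regardless of its label, use df.iloc[:, 0].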
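The difference between label-based and position-based column selection can be sketched as follows.
This is a minimal illustration; the DataFrame and its column names ("price", "tax") are assumptions
made up for the example.

```python
import pandas as pd

# Hypothetical example data; the column names are illustrative only.
df = pd.DataFrame({"price": [10.0, 12.5], "tax": [1.0, 1.25]})

first_by_position = df.iloc[:, 0]  # first column, whatever its label
first_by_label = df["price"]       # the same column, selected by name

# df[0] looks for a column *labelled* 0, which does not exist here.
try:
    df[0]
    label_zero_found = True
except KeyError:
    label_zero_found = False
```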
Visualizations:
86) What type of plot would you use to visualize the distribution of a single variable?
a) Scatter plot
b) Line plot
c) Histogram (Correct)
d) Box plot
Explanation: Histograms show the frequency distribution of numerical data.
Functions:
87) Which keyword is used to define a function in Python?
a) def (Correct)
b) function
c) create
d) procedure
Explanation: The def keyword is used to define a function in Python.
Control Flow:
89) Which keyword is used to create an if-else statement in Python?
a) if...else (Correct)
b) choose...when...otherwise
c) select...case...default
d) simply use brackets without keywords
Explanation: The if...else statement is used for conditional execution in Python.
Loops:
91) Which loop statement executes a block of code repeatedly for a fixed number of times?
a) while
b) for
c) range
d) None of the above
Explanation: The for loop iterates a number of times specified by a range or iterable.
92) How would you iterate through each element in a list named numbers?
a) for element in numbers: print(element)
b) for i in range(len(numbers)): print(numbers[i])
c) Both of the above are correct.
d) Neither of the above is correct.
Explanation: Both options a and b achieve the same result: iterating through each element in the list.
94) How would you create a new column in a DataFrame named "total_cost" by adding the values in two
other columns named "price" and "tax"?
a) df["total_cost"] = df["price"] + df["tax"]
b) df.add_column("total_cost", formula="price + tax")
c) df["total_cost"] = df.apply(lambda row: row["price"] + row["tax"], axis=1)
d) All of the above
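Explanation: Options a and c both create the new column: a uses vectorised addition and is the
idiomatic choice, while c applies the same sum row by row and is slower. Option b is not valid because
pandas DataFrames have no add_column() method, so "All of the above" does not hold.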
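As a minimal sketch of the column calculation, assuming a small made-up DataFrame with "price" and
"tax" columns, the vectorised and row-wise forms can be compared directly:

```python
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({"price": [100.0, 250.0], "tax": [18.0, 45.0]})

# Vectorised addition: the idiomatic way to derive a column.
df["total_cost"] = df["price"] + df["tax"]

# Row-wise equivalent with apply(); same values, but slower on large data.
df["total_cost_apply"] = df.apply(lambda row: row["price"] + row["tax"], axis=1)

# pandas DataFrames expose no add_column() method.
has_add_column = hasattr(df, "add_column")
```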
96) How would you add a color-coded legend to a seaborn heatmap based on values in a column named
"category"?
a) heatmap(data, hue="category")
b) heatmap(data).color_legend(title="Category")
c) heatmap(data).add_legend(title="Category")
d) Both a and c are correct.
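Explanation: None of the options is valid seaborn code. heatmap() has no "hue" parameter, and it
returns a Matplotlib Axes object, which has no color_legend() or add_legend() method. A heatmap maps
cell values to colors and shows the scale as a colorbar (drawn by default, cbar=True), so categories
must first be encoded as numeric values in the data passed to heatmap().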
99) What function is used to read the entire contents of a text file into a string?
a) read()
b) read()
c) readlines()
d) readall()
Explanation: The read() method of a file object, when called without arguments, reads the entire file
contents into a single string.
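A short sketch of the difference between read() and readlines(), assuming a throwaway file named
"notes.txt" (a hypothetical name) created in a temporary directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "notes.txt")  # hypothetical file name

    with open(path, "w", encoding="utf-8") as f:
        f.write("line one\nline two\n")

    # read() with no argument returns the whole file as one string.
    with open(path, encoding="utf-8") as f:
        whole_text = f.read()

    # readlines() returns a list with one string per line.
    with open(path, encoding="utf-8") as f:
        lines = f.readlines()
```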
102) Which module is commonly used to work with CSV files in Python?
a) fileio
b) csv
c) csvreader
d) csv
Explanation: The csv module provides easy-to-use functions for reading and writing CSV files, handling
data separation and formatting.
103) How do you create a new CSV file and write data to it?
a) csv.create()
b) csv.write()
c) csv.writer()
d) csv.new()
Explanation: The csv.writer() function creates a writer object that allows writing data to a CSV file in a
structured format.
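A minimal sketch of csv.writer() in use. An in-memory io.StringIO buffer stands in for a real file
here; the column names and rows are made up for illustration.

```python
import csv
import io

# In-memory text buffer; with a real file you would use
# open("trades.csv", "w", newline="") instead (hypothetical file name).
buffer = io.StringIO()

writer = csv.writer(buffer)
writer.writerow(["ticker", "price"])                # header row
writer.writerows([["ABC", 101.5], ["XYZ", 99.0]])   # data rows

csv_text = buffer.getvalue()
```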
104) Which library offers high-level functionalities for reading and writing various data formats?
a) pandas
b) fileio
c) os
d) csv
Explanation: Option a is correct. Pandas provides convenient functions for reading and writing various
data formats like CSV, Excel, and HDF5, offering advanced data manipulation capabilities.
Descriptive Analytics:
105) Which Python library is commonly used for descriptive statistics and data exploration?
a) NumPy
b) Pandas
c) Matplotlib
d) Scikit-learn
Explanation: b) Pandas is the primary library for data manipulation and analysis in Python, offering
rich functions for statistics and data exploration.
106) What method in pandas would you use to calculate summary statistics for a financial dataset?
a) describe()
b) summarize()
c) stats()
d) analyze()
Explanation: a) describe() method in Pandas provides comprehensive summary statistics for numerical
columns, including mean, median, standard deviation, etc.
Diagnostic Analytics:
107) Which technique can help identify outliers in financial data?
a) Correlation analysis
b) Box plots
c) Linear regression
d) Decision trees
Explanation: b) Box plots effectively identify outliers by depicting quartiles and extreme values,
highlighting unusual data points in financial transactions or market trends.
108) To investigate relationships between multiple financial variables, which visualization would be most
suitable?
a) Scatter plot
b) Histogram
c) Bar chart
d) Pie chart
Explanation: a) Scatter plot allows visualizing relationships between two continuous variables like
stock prices and market indicators, revealing potential correlations and trends.
Predictive Analytics:
109) Which machine learning algorithm is commonly used for predicting stock prices?
a) Linear regression
b) Logistic regression
c) Decision trees
d) Support vector machines
Explanation: a) Linear regression is a popular choice for predicting continuous variables like stock
prices due to its interpretability and ability to model linear relationships between features.
110) Which metric is often used to evaluate the accuracy of a predictive model for financial data?
a) Mean squared error (MSE)
b) Accuracy
c) Precision
d) Recall
Explanation: a) Mean squared error (MSE) measures the average squared difference between predicted
and actual values, serving as a good metric for continuous financial data like price predictions.
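The MSE definition above can be worked through by hand. The sketch below uses plain Python and
made-up actual and predicted prices:

```python
# Hypothetical actual and predicted prices.
actual = [100.0, 102.0, 105.0]
predicted = [98.0, 103.0, 109.0]

# Squared difference for each pair: 4.0, 1.0, 16.0
squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]

# Mean of the squared differences.
mse = sum(squared_errors) / len(squared_errors)
```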
111) What are some challenges of applying Python analytics to financial data?
a) Data acquisition and cleaning challenges
b) Market volatility and non-linear relationships
c) Ethical considerations and bias in models
d) All of the above (Correct)
114) Which method can be used to remove rows with missing values?
a) dropna()
b) fillna()
c) replace()
d) remove_missing()
Explanation: dropna() is specifically designed to remove rows or columns containing missing values.
115) How would you replace missing values with the mean of a column?
a) fillna(mean())
b) replace(NaN, mean())
c) impute(mean())
d) fillna(mean, inplace=True)
Explanation: fillna() replaces missing values with the value passed to it. Option a is shorthand for
passing the column mean, written in full as df["col"].fillna(df["col"].mean()).
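Written out in full, mean imputation might look like the sketch below. The DataFrame and the "price"
column are assumptions for the example.

```python
import pandas as pd

# Hypothetical data with one missing value.
df = pd.DataFrame({"price": [10.0, None, 14.0]})

# mean() skips NaN by default, so the mean here is (10 + 14) / 2 = 12.
df["price"] = df["price"].fillna(df["price"].mean())
```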
118) How would you calculate the average return of a stock over a period?
a) mean(prices)
b) cumsum(prices)
c) (last_price - start_price) / start_price
d) All of the above
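Explanation: Option c calculates the simple (holding-period) return: the change in price over the whole
period relative to the starting price. Strictly, this is a total return rather than an average; an
average per-period return would be the mean of the individual period returns. mean(prices) and
cumsum(prices) operate on price levels, not returns.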
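To make the calculation concrete, the sketch below computes the return over the whole period and,
for comparison, the mean of the per-period returns. The price series is made up for illustration.

```python
# Hypothetical closing prices over four periods.
prices = [100.0, 110.0, 99.0, 108.9]

# Return over the whole period, relative to the starting price.
total_return = (prices[-1] - prices[0]) / prices[0]

# Return for each consecutive pair of prices.
period_returns = [(b - a) / a for a, b in zip(prices, prices[1:])]

# Arithmetic mean of the per-period returns.
average_return = sum(period_returns) / len(period_returns)
```

With these numbers the total return is about 8.9%, while the mean per-period return is about 3.3%.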
119) What are some challenges of performing statistical analysis on financial data?
a) Data quality and missing values.
b) Non-linear relationships and market volatility.
c) Choosing the right statistical tests and interpreting results.
d) All of the above.
Explanation: Option d covers all major challenges. Financial data can be messy, relationships may not
be linear, and selecting and interpreting appropriate tests requires careful consideration.
121) What are some ethical considerations when using Python for financial analysis or trading?
a) Avoiding data bias and ensuring transparency in model development.
b) Mitigating potential market manipulation through responsible algorithmic trading.
c) Protecting sensitive financial data and preventing cyberattacks.
d) All of the above.
Explanation: Option d emphasizes various ethical concerns. Python offers powerful tools, but users
must ensure responsible practice, fair algorithms, and secure data handling.
122) Which financial application demonstrates the use of natural language processing with Python?
a) Analyzing news articles to gauge market sentiment.
b) Extracting financial information from textual reports.
c) Generating automated financial summaries.
d) All of the above.
Explanation: Option d encompasses all possibilities. Python's NLP capabilities can analyze text data in
finance, extract key information, and even generate reports or summaries.
Data Manipulation:
123) You have a list of lists containing student data like [["John", 80, 95], ["Alice", 90, 75]]. How would
you access the second element (score) of Alice's data?
a) data[1][1]
b) data["Alice"][1]
c) data.Alice[1]
d) data.get("Alice")[1]
Explanation: Option a) uses nested indexing within the list to access the desired element.
124) To calculate the average of all elements in a list [1, 2, 3, 4], which built-in function would you use?
a) sum()
b) mean()
c) average()
d) sum(list) / len(list)
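Explanation: Option d) is correct. Python has no built-in mean() or average() function, so the average
is computed as sum(list) / len(list). A mean() function is available only after importing it, for
example from the statistics module.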
125) If you have a dictionary { "Age": 25, "Name": "John" }, how would you create a new dictionary
with reversed key-value pairs?
a) reversed_dict = dict(zip(dict.values(), dict.keys()))
b) reversed_dict = {value: key for key, value in dict.items()}
c) temp = list(dict.items()); reversed_dict = dict(temp[::-1])
d) None of the above
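Explanation: Option b) uses a dictionary comprehension to swap keys and values. Option a) expresses the
same idea with zip, but fails if the dictionary is stored under the name dict, because that name then
shadows the built-in constructor. Option c) only reverses the order of the items without swapping them.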
126) To remove duplicate elements from a list [1, 2, 2, 3, 1], which method would you use?
a) list.remove_duplicates()
b) set(list)
c) list(dict.fromkeys(list))
d) All of the above
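Explanation: Options b) and c) both remove duplicates. Option b) creates a set, which discards
duplicates but does not preserve order, while option c) builds a dictionary from the elements and
converts its keys back to a list, preserving the original order. Option a) is not valid because
Python lists have no remove_duplicates() method, so "All of the above" does not hold.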
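A minimal sketch of the two working approaches. The variable is named numbers rather than list, to
avoid shadowing the built-in list type.

```python
numbers = [1, 2, 2, 3, 1]

# A set discards duplicates but does not guarantee the original order.
unique_unordered = set(numbers)

# dict.fromkeys() keeps the first occurrence of each element, in order.
unique_ordered = list(dict.fromkeys(numbers))
```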
127) You have a list of names and a list of corresponding ages. How would you create a dictionary
mapping names to ages?
a) name_age_dict = dict(zip(names, ages))
b) for i in range(len(names)): name_age_dict[names[i]] = ages[i]
c) name_age_dict = {name: age for name, age in zip(names, ages)}
d) Both a and c
Explanation: Options a) and c) both utilize zip to associate names and ages in a dictionary.
128) To sort a list [3, 5, 1, 4, 2] in descending order, which built-in function would you use?
a) list.sort(reverse=True)
b) sorted(list, reverse=True)
c) list.reverse()
d) None of the above
Explanation: Option a) or b) sorts the list in place or returns a new sorted list, both with the
reverse=True argument specifying descending order.
129) How would you find the maximum element in a list [10, 5, 15, 2]?
a) max(list)
b) list.max()
c) for element in list: largest = max(largest, element)
d) All of the above
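Explanation: Option a) uses the built-in max() function to find the largest element. Lists have no
max() method, so option b) is not valid, and option c) works only if largest is initialised before the
loop.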
130) To count the occurrences of each element in a list [1, 2, 2, 3, 1], what would you use?
a) collections.Counter(list)
b) for element in list: count[element] += 1
c) list.count(element)
d) Both a and b
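Explanation: Option a) uses the collections.Counter class to count element occurrences, while option b)
implements a manual loop, which works only if count already supplies a default of 0 for unseen keys
(for example, a collections.defaultdict(int)). Option c) counts a single element at a time.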
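Both counting approaches can be sketched as below. The manual loop assumes count is a
collections.defaultdict(int); with a plain dict, count[element] += 1 would raise a KeyError on the
first unseen element.

```python
from collections import Counter, defaultdict

items = [1, 2, 2, 3, 1]

# Counter tallies every element in one call.
counts = Counter(items)

# Manual loop; defaultdict(int) starts every missing key at 0.
count = defaultdict(int)
for element in items:
    count[element] += 1
```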
131) How would you combine two lists [1, 2, 3] and [4, 5, 6] into a single list?
a) list1 + list2
b) list.extend(list2)
c) list.append(list2)
d) All of the above
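Explanation: Only options a) and b) produce [1, 2, 3, 4, 5, 6]. list1 + list2 returns a new
concatenated list, and extend() adds the elements of the second list to the first in place. append()
adds the second list as a single nested element, giving [1, 2, 3, [4, 5, 6]], so "All of the above"
does not hold.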
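The behaviour of the three operations can be checked with a short sketch. Copies of list1 are used so
that each operation starts from the same input.

```python
list1 = [1, 2, 3]
list2 = [4, 5, 6]

# + returns a new list and leaves both inputs unchanged.
combined = list1 + list2

# extend() adds each element of list2 in place.
extended = list(list1)
extended.extend(list2)

# append() adds list2 itself as one nested element.
appended = list(list1)
appended.append(list2)
```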
132) To check if a specific element (5) exists in a list [1, 2, 3, 4], what method would you use?
a) if 5 in list:
b) list.contains(5)
c) element in list for element in list if element == 5
d) Both a and b
Explanation: Option a) is correct: the in operator checks whether an element is present. Python lists
have no contains() method, so option b) is not valid.
134) To remove leading and trailing whitespace from a string " Hello ", what method would you use?
a) string.strip()
b) string.lstrip()
c) string.rstrip()
d) Both a and b
Explanation: Option a) using string.strip() removes both leading and trailing whitespace, while options
b) and c) target specific sides.
135) How would you split a string "Hello, world" into a list of words?
a) string.split()
b) list(string.split())
c) words = [] for char in string: words.append(char)
d) None of the above
Explanation: Option a) and b) directly utilize the string.split() method to separate the string by the
default delimiter (whitespace).
136) To format a number (10) into a string with two decimal places, what would you use?
a) str(10.00)
b) f"{10:.2f}"
c) "{0:.2f}".format(10)
d) Both b and c
Explanation: Option b) and c) use string formatting techniques (f-strings or format method) to specify
the desired precision of two decimal places.
137) How would you convert a string "True" to a boolean True in Python?
a) bool("True")
b) if string == "True": True
c) int(string)
d) None of the above
Explanation: bool("True") returns True, but only because any non-empty string is truthy; bool("False")
also returns True. To convert the text reliably, compare it explicitly, for example string == "True".
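A short sketch of why truthiness and text parsing differ. parse_bool is a hypothetical helper written
for this example, not a built-in function.

```python
def parse_bool(text: str) -> bool:
    """Hypothetical helper: map the text 'True' or 'False' to a boolean."""
    cleaned = text.strip().lower()
    if cleaned == "true":
        return True
    if cleaned == "false":
        return False
    raise ValueError(f"not a boolean string: {text!r}")


# bool() tests truthiness: any non-empty string is True.
truthy_false = bool("False")

parsed_true = parse_bool("True")
parsed_false = parse_bool("False")
```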
138) To iterate through each element in a list [1, 2, 3], what loop structure would you use?
a) for element in list:
b) while element in list:
c) for i in range(len(list)): print(list[i])
d) All of the above
Explanation: Option a) utilizes the standard for-loop iterating over each element in the list.
139) How would you access the third element (index 2) in a list [1, 2, 3]?
a) list[2]
b) list.get(2)
c) list[len(list)-1]
d) Both a and b
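Explanation: Option a) accesses the element at index 2 using direct indexing. Lists have no get()
method (get() is a dictionary method), so option b) is not valid. Option c) also returns the third
element here, but only because the list has exactly three elements.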
140) To replace all occurrences of "a" with "b" in a string "apple", what method would you use?
a) string.replace("a", "b")
b) string = string.replace("a", "b")
c) for i in range(len(string)): string = string.replace(string[i], "b")
d) Both a and b
Explanation: Option a) and b) directly utilize the string.replace() method to efficiently swap
occurrences of specific characters.
141) To check if a file exists in a particular directory ("myfile.txt"), which function would you use?
a) os.path.exists("myfile.txt")
b) if "myfile.txt" in os.listdir():
c) try: open("myfile.txt") except FileNotFoundError: False
d) All of the above
Explanation: All options achieve the desired outcome. Option a) uses the os.path.exists() function
specifically for checking file existence, while options b) and c) utilize various methods including listing
directory contents or attempting to open the file.
143) To check the data types of columns in a DataFrame, which method would you use?
a) dtypes()
b) dataTypes()
c) types()
d) check_types()
Explanation: a) dtypes - This returns the data types of all columns in the DataFrame. Note that dtypes
is an attribute rather than a method, so it is accessed as df.dtypes, without parentheses.
144) How would you convert the 'Price' column from string to float in a DataFrame named 'df'?
a) df['Price'] = float(df['Price'])
b) df['Price'] = df['Price'].astype(float)
c) df['Price'].convert_type(float)
d) Both a and b
Explanation: b) df['Price'] = df['Price'].astype(float) - astype(float) converts every value in the
'Price' column from string to float. Option a does not work: float() accepts a single value, so calling
it on a Series with more than one element raises a TypeError.
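A minimal sketch of the conversion, assuming a made-up 'Price' column stored as strings. It also shows
what happens when the built-in float() is called on the whole column.

```python
import pandas as pd

# Hypothetical data: prices stored as text.
df = pd.DataFrame({"Price": ["10.5", "20", "3.25"]})

# float() expects a single value, not a multi-element Series.
try:
    float(df["Price"])
    float_on_series_worked = True
except TypeError:
    float_on_series_worked = False

# astype(float) converts the column element by element.
df["Price"] = df["Price"].astype(float)
```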
145) Which method adds a new column named 'Total' to a DataFrame, calculating the sum of two
existing columns 'A' and 'B'?
a) df['Total'] = df['A'] + df['B']
b) df.add_column('Total', df['A'] + df['B'])
c) df.create_column('Total', df['A'] + df['B'])
d) df.insert_column('Total', df['A'] + df['B'])
Explanation: a) df['Total'] = df['A'] + df['B'] - This directly assigns the sum of 'A' and 'B' to a new
column 'Total'.
146) To concatenate two DataFrames vertically (stacking rows), which method would you use?
a) pd.concat([df1, df2], axis=0)
b) pd.concat([df1, df2], axis=1)
c) df1.concat(df2, vertical=True)
d) df1.append(df2)
Explanation: a) pd.concat([df1, df2], axis=0) - The axis=0 parameter specifies vertical concatenation
(stacking rows) of the two DataFrames.
147) Which method merges (similar SQL-like joins) two DataFrames based on a common column?
a) pd.merge(df1, df2, on='common_column')
b) df1.join(df2, on='common_column')
c) df1.merge_with(df2, column='common_column')
d) Both a and b
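Explanation: a) pd.merge(df1, df2, on='common_column') - This performs an SQL-like join on a column
present in both DataFrames. df1.join(df2, on='common_column') is not equivalent: join() matches the
named column of df1 against the index of df2, so it gives the same rows only if df2 is indexed by that
column.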
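The difference between merge() and join() can be sketched with two small made-up tables. The table and
column names (customers, orders, customer_id) are assumptions for the example.

```python
import pandas as pd

# Hypothetical tables sharing a customer_id column.
customers = pd.DataFrame({"customer_id": [1, 2, 3],
                          "name": ["Asha", "Ben", "Chen"]})
orders = pd.DataFrame({"customer_id": [1, 1, 3],
                       "amount": [250.0, 75.0, 120.0]})

# merge() matches the column in both frames (inner join by default).
merged = pd.merge(customers, orders, on="customer_id")

# join() matches the left column against the *index* of the right frame,
# so the right frame is re-indexed by customer_id first.
joined = customers.join(orders.set_index("customer_id"),
                        on="customer_id", how="inner")
```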
148) To create a new column named 'Profit' that calculates profit as 'Revenue' minus 'Cost', which
expression would you use?
a) df['Profit'] = df['Revenue'] - df['Cost']
b) df.calculate_column('Profit', df['Revenue'] - df['Cost'])
c) df.profit = df.revenue - df.cost
d) df.add_column('Profit', lambda x: x['Revenue'] - x['Cost'])
Explanation: a) df['Profit'] = df['Revenue'] - df['Cost'] - This directly defines a new 'Profit' column with
the difference between 'Revenue' and 'Cost'.
149) How would you select rows where the 'Age' column is greater than 30 in a DataFrame?
a) df[df['Age'] > 30]
b) df.filter(df['Age'] > 30)
c) df.select(df['Age'] > 30)
d) df.query('Age > 30')
Explanation: a) df[df['Age'] > 30] - This boolean indexing selects rows where the 'Age' condition is
True. Option d) df.query('Age > 30') is a valid alternative that returns the same rows.
150) To remove rows with missing values (NaN) from a DataFrame, which method would you use?
a) df.dropna()
b) df.remove_nulls()
c) df.filter_missing()
d) df.drop(NaN)
Explanation: a) df.dropna() - This method drops rows containing any missing values (NaN) in the
DataFrame.
151) Which method sorts a DataFrame by the 'Name' column in ascending order?
a) df.sort('Name')
b) df.sort_values('Name')
c) df.arrange('Name')
d) df.order_by('Name')
Explanation: b) df.sort_values('Name') - This method sorts the DataFrame by the specified column
('Name') in ascending order, which is the default.
152) To group data by the 'Category' column and calculate the mean of the 'Value' column for each
group, which expression would you use?
a) df.groupby('Category')['Value'].mean()
b) df.mean_by('Category', 'Value')
c) df.aggregate(mean=('Value'), by='Category')
d) Both a and c
Explanation: a) df.groupby('Category')['Value'].mean() - This groups the rows by 'Category' and
calculates the mean of 'Value' within each group. Option c is not valid pandas: aggregate() has no by
parameter, and grouping must be done with groupby() first.
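A minimal sketch of grouping and averaging, assuming a small made-up DataFrame. The second form uses
named aggregation on the grouped object; the output column name mean_value is an assumption.

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame({"Category": ["A", "B", "A", "B"],
                   "Value": [10.0, 20.0, 30.0, 40.0]})

# Mean of Value within each Category.
means = df.groupby("Category")["Value"].mean()

# Named aggregation: the same result as a labelled column.
named = df.groupby("Category").agg(mean_value=("Value", "mean"))
```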
153) Which method replaces missing values in a column with the median of that column?
a) df['Column'].fillna(df['Column'].median())
b) df['Column'].replace(NaN, df['Column'].median())
c) df.imputate('Column', median())
d) All of the above
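Explanation: a) df['Column'].fillna(df['Column'].median()) - fillna replaces the missing values with
the column median. Option b) only works if NaN is defined (for example as numpy.nan), and option c)
uses imputate, which is not a pandas method, so 'All of the above' is incorrect.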
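A minimal sketch of the fillna approach from option a), using made-up values. Series.median() skips missing values by default, so the median is computed from the observed entries only:

```python
import numpy as np
import pandas as pd

# Toy column with one missing entry
df = pd.DataFrame({"Column": [1.0, np.nan, 3.0, 8.0]})

# Median of the observed values 1, 3 and 8
median_value = df["Column"].median()

# fillna returns a new Series, so assign it back to keep the result
df["Column"] = df["Column"].fillna(median_value)
```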
154) To rename a column from 'Old_Name' to 'New_Name' in a DataFrame, which method would
you use?
a) df.rename(columns={'Old_Name': 'New_Name'})
b) df['New_Name'] = df['Old_Name']
c) df.rename_column('Old_Name', 'New_Name')
d) df.set_column_name('Old_Name', 'New_Name')
Explanation: a) df.rename(columns={'Old_Name': 'New_Name'}) - This method explicitly renames the
specified column.
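One detail worth seeing in code: rename returns a new DataFrame by default, so the result has to be assigned to keep the change. A small sketch on hypothetical data:

```python
import pandas as pd

# Toy data, for illustration only
df = pd.DataFrame({"Old_Name": [1, 2]})

# rename returns a renamed copy; df itself is unchanged
renamed = df.rename(columns={"Old_Name": "New_Name"})
```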
155) How would you select the first 5 rows and the last 2 columns of a DataFrame?
a) df.iloc[:5, -2:]
b) df.head(5)[:-2]
c) df.sample(5, axis=1, replace=True)
d) df.loc[0:4, len(df.columns)-2:]
Explanation: a) df.iloc[:5, -2:] - This indexes the first 5 rows ([:5]) and last 2 columns (-2:) using
integer location-based indexing.
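The slice can be checked on a small made-up DataFrame. The column names below are hypothetical; iloc ignores labels and works purely by position:

```python
import pandas as pd

# Toy data: 8 rows and 4 columns
df = pd.DataFrame({"a": range(8), "b": range(8), "c": range(8), "d": range(8)})

# Rows at positions 0-4 and the last two columns by position
subset = df.iloc[:5, -2:]
```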
157) To check for duplicate rows in a DataFrame, which method would you use?
a) df.duplicated()
b) df.check_duplicates()
c) df.unique()
d) df.is_unique()
Explanation: a) df.duplicated() - This method returns a boolean Series that flags rows repeating an
earlier row across all columns; df.duplicated().sum() counts them.
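A brief sketch using DataFrame.duplicated(), the pandas method for flagging duplicate rows, on made-up data. By default the first occurrence is not flagged, only the later repeats:

```python
import pandas as pd

# Toy data: the second row repeats the first
df = pd.DataFrame({"x": [1, 1, 2], "y": ["a", "a", "b"]})

# Boolean Series: True for rows that repeat an earlier row
flags = df.duplicated()

# Summing the flags counts the duplicate rows
n_dupes = int(flags.sum())

# drop_duplicates removes the flagged rows
deduped = df.drop_duplicates()
```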
158) How would you create a copy of a DataFrame with modifications only affecting the copy?
a) df_copy = df
b) df_copy = df.copy()
c) df_copy = df[:]
d) All of the above
Explanation: b) df_copy = df.copy() - This creates a new DataFrame (df_copy) that is a copy of the
original (df), allowing modifications without affecting the original.
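The difference between plain assignment and copy() can be sketched as follows, on hypothetical data. Plain assignment only creates a second name for the same object, while copy() creates an independent DataFrame:

```python
import pandas as pd

# Toy data, for illustration only
df = pd.DataFrame({"v": [1, 2, 3]})

# Plain assignment: both names refer to the same DataFrame object
alias = df

# copy() makes an independent (deep, by default) copy
df_copy = df.copy()

# Changing the copy does not touch the original
df_copy.loc[0, "v"] = 99
```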
160) To get descriptive statistics of a numerical column, which method would you use?
a) df['Column'].describe()
b) df.descriptive_stats('Column')
c) df.summary('Column')
d) All of the above
Explanation: a) df['Column'].describe() - This returns the count, mean, standard deviation, minimum,
quartiles, and maximum of the selected column. descriptive_stats and summary are not pandas DataFrame
methods, so 'All of the above' is incorrect.
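For reference, a small sketch of what describe() returns for a numeric column, using made-up values:

```python
import pandas as pd

# Toy numeric column
df = pd.DataFrame({"Column": [2.0, 4.0, 6.0, 8.0]})

# describe() returns a Series of summary statistics for the column
stats = df["Column"].describe()
```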
QUESTION 1
You are working as a data scientist in a business consulting firm. Diverse Cars, one of your clients, has provided
you with datasets containing two years' car sales data across Asia, Europe, and the USA. The management of
Diverse Cars wants to analyze these datasets for financial reporting and decision-making purposes.
The datasets provided by Diverse Cars are stored in the CARS.csv and CARSALES.csv files. You use Python to
analyze the data in the Jupyter Notebook.
Required:
In the Jupyter Notebook (CRN.ipynb), write the appropriate Python code for each of the following tasks:
(a) Load the following datasets: (2.5 marks)
(i) CARSALES.csv as csales; and
(ii) CARS.csv as cars
import pandas as pd
# Load datasets
csales = pd.read_csv("CARSALES.csv")
cars = pd.read_csv("CARS.csv")
(b) In csales, change the data type of 'Latest_Launch' from object to 'datetime'. Examine csales to ensure that
the data type has been changed as desired. (04 marks)
csales["Latest_Launch"] = pd.to_datetime(csales["Latest_Launch"])
print(csales["Latest_Launch"].dtype)
(c) Using the 'Latest_Launch' column, add a column to csales namely 'Year'. (03 marks)
csales['Year'] = csales['Latest_Launch'].dt.year
(d) Add a column namely, 'Revenue' and fill it by taking the product of 'Sales_in_thousands' and
'Price_in_thousands'. (2.5 marks)
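One possible answer, sketched on a small stand-in for csales so the block runs without the CSV file (the two rows below are made up; the column names are the ones given in the task):

```python
import pandas as pd

# Stand-in for csales; in the notebook this DataFrame comes from CARSALES.csv
csales = pd.DataFrame({
    "Sales_in_thousands": [10.0, 2.5],
    "Price_in_thousands": [20.0, 40.0],
})

# Element-wise product of the two columns, stored as a new column
csales["Revenue"] = csales["Sales_in_thousands"] * csales["Price_in_thousands"]
```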
(e) Use 'Make' and 'Manufacturer' columns of the two datasets cars and csales to create a right join and merge
them. Name the merged data as 'mg_data'. (05 marks)
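A hedged sketch of the right join using pd.merge. It assumes that 'Make' is the key column in cars and 'Manufacturer' the key column in csales, with cars on the left and csales on the right; the task does not state this explicitly, so swap the arguments if the files differ. The rows and the 'Origin' and 'Revenue' values are made up:

```python
import pandas as pd

# Stand-ins for the two datasets (hypothetical rows)
cars = pd.DataFrame({
    "Make": ["Acura", "Audi", "BMW"],
    "Origin": ["Asia", "Europe", "Europe"],
})
csales = pd.DataFrame({
    "Manufacturer": ["Acura", "Audi", "Audi", "Volvo"],
    "Revenue": [200.0, 100.0, 150.0, 80.0],
})

# Right join: keep every row of the right table (csales) and attach
# matching rows from cars; unmatched keys get NaN in the cars columns
mg_data = pd.merge(cars, csales, how="right", left_on="Make", right_on="Manufacturer")
```

In this toy example 'Volvo' has no match in cars, so its 'Make' and 'Origin' are NaN, while 'BMW' is dropped because it does not appear in the right table.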
(f) Using the mg_data, group the data by origin and calculate the total revenue for each origin. Convert the
calculated total revenue to millions. Store the result in a new DataFrame namely, 'origin_revenue'. Print the
'origin_revenue' to show your result. (04 marks)
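A possible approach, sketched on a made-up mg_data. The column names 'Origin' and 'Revenue' are taken from the plotting code in part (g). The divisor of 1,000,000 is an assumption about the units of 'Revenue'; adjust it to match how revenue is expressed in the actual data:

```python
import pandas as pd

# Stand-in for mg_data (hypothetical rows)
mg_data = pd.DataFrame({
    "Origin": ["Asia", "Europe", "Asia"],
    "Revenue": [2_000_000.0, 3_000_000.0, 500_000.0],
})

# Total revenue per origin; as_index=False keeps 'Origin' as a regular column
origin_revenue = mg_data.groupby("Origin", as_index=False)["Revenue"].sum()

# Convert to millions (assumed divisor, see the note above)
origin_revenue["Revenue"] = origin_revenue["Revenue"] / 1_000_000

print(origin_revenue)
```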
(g) Create a bar plot showing Revenue by Car Origin. The bar plot should look similar to the following: (04
marks)
# Import matplotlib
import matplotlib.pyplot as plt
plt.figure(figsize=(10, 6))
plt.bar(origin_revenue["Origin"], origin_revenue["Revenue"])
plt.xlabel("Car Origin")
plt.ylabel("Revenue (Millions)")
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
Using the information in the Diverse Cars scenario, select the most appropriate answer for each of the following
multiple choice questions:
(h) The number of duplicate values in all columns of csales is: (02 marks)
a) 00
b) 2
c) 05
d) 01
csales_duplicates = csales.duplicated().sum()
(i) The number of null values in the 'Fuel_efficiency' column is: (02 marks)
a) 2
b) 03
c) 04
d) 05
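The count can be obtained with isna().sum(). The sketch below uses a made-up stand-in for csales, so the number it produces says nothing about the answer for the real dataset:

```python
import numpy as np
import pandas as pd

# Stand-in for csales (hypothetical values)
csales = pd.DataFrame({"Fuel_efficiency": [28.0, np.nan, 22.0, np.nan]})

# isna() marks missing entries as True; sum() counts them
fuel_efficiency_nulls = int(csales["Fuel_efficiency"].isna().sum())
```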
(j) The total number of records in the mg_data is: (02 marks)
a) 66,612
b) 67,196
c) 2,058
d) 2,959
total_records_mg_data = len(mg_data)
QUESTION 2
Read the situation and select the most appropriate answer for each of the given multiple choice questions:
XYZ Limited operates a complex supply chain network involving multiple suppliers, manufacturers, distributors,
and retailers across different regions. The existing supply chain faced issues such as delays, data
inconsistencies, and difficulties in tracking the origin of products. Additionally, traditional paper-based
processes and centralized databases led to inefficiencies and a lack of real-time visibility.
XYZ Limited aims to enhance the efficiency, transparency, and traceability of its supply chain. The primary
objectives are to reduce operational costs, mitigate the risk of counterfeit products, and improve overall
collaboration among stakeholders.
XYZ Limited has been advised by its management consultant to use blockchain technologies, as they could
significantly reduce processing time across every step of this process. Each transaction indicating a movement
of goods would be recorded, from raw materials to the finished product. Documentation would be created,
updated, viewed, or verified by parties on the blockchain, enabling visibility of the entire supply chain.
1. In the finance aspect of the supply chain, how can blockchain streamline payment processes between
different parties, and what role do smart contracts play in this context?
a. By increasing manual verification
b. Through seamless and automated payment execution
c. By introducing complex financial instruments
d. Smart contracts have no relevance in payment processes
Smart contracts facilitate automated execution of predefined terms of agreements, ensuring that payments are
made automatically upon fulfillment of conditions.
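The idea of conditional, automated payment can be illustrated with a plain Python sketch. This is not real smart contract code and does not run on any blockchain; the class and its fields are hypothetical and only mimic the rule "pay once, and only when the delivery condition is met":

```python
class DeliveryPaymentContract:
    """Toy model of a pay-on-delivery agreement (hypothetical, off-chain)."""

    def __init__(self, buyer_balance, supplier_balance, amount):
        self.buyer_balance = buyer_balance
        self.supplier_balance = supplier_balance
        self.amount = amount
        self.paid = False

    def confirm_delivery(self, delivered):
        # Payment executes automatically, but only if the goods were delivered,
        # the contract has not already paid, and the buyer has enough funds
        if delivered and not self.paid and self.buyer_balance >= self.amount:
            self.buyer_balance -= self.amount
            self.supplier_balance += self.amount
            self.paid = True
        return self.paid


contract = DeliveryPaymentContract(buyer_balance=1000.0, supplier_balance=0.0, amount=250.0)
contract.confirm_delivery(delivered=False)  # condition not met: nothing moves
contract.confirm_delivery(delivered=True)   # condition met: payment executes
contract.confirm_delivery(delivered=True)   # already paid: no second transfer
```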
2. In a permissioned blockchain environment, what role do consensus algorithms play, and how do they
contribute to the integrity of the supply chain data?
a. They regulate access permissions to the blockchain
b. They establish trust between participants and prevent double spending
c. They determine the order of transactions and ensure agreement among nodes
d. They encrypt data to secure it against unauthorized access
Consensus algorithms are essential in permissioned blockchains to establish agreement among nodes
regarding the validity and order of transactions, thus maintaining the integrity of supply chain data.
3. For a permissioned blockchain environment which one of the following should be considered?
a. Bitcoin
b. Ethereum
c. Ripple
d. Hyperledger
Hyperledger (for example, Hyperledger Fabric) provides permissioned blockchain frameworks designed for
enterprise use, making it suitable for implementing supply chain solutions.
4. What benefit does a full audit trail on the blockchain provide in the supply chain?
a. Increased consumer demand
b. Protection against counterfeit goods
c. Inefficient payment processing
d. Lack of visibility
A full audit trail on the blockchain ensures transparency and traceability, helping to prevent the circulation of
counterfeit goods by providing visibility into the entire supply chain process.
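Why an audit trail of this kind is hard to falsify can be sketched with a simple hash chain. This is a simplified, single-machine illustration using Python's hashlib, not a real blockchain; the record fields and function names are hypothetical. Each entry stores the hash of the previous entry, so changing an earlier record breaks the chain:

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder hash for the first entry


def block_hash(record, prev_hash):
    # Hash the record together with the previous entry's hash
    payload = json.dumps({"record": record, "prev_hash": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records):
    chain = []
    prev_hash = GENESIS_HASH
    for record in records:
        current = block_hash(record, prev_hash)
        chain.append({"record": record, "prev_hash": prev_hash, "hash": current})
        prev_hash = current
    return chain


def is_valid(chain):
    # Recompute every hash and check that each entry links to the one before it
    prev_hash = GENESIS_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["record"], prev_hash):
            return False
        prev_hash = block["hash"]
    return True


chain = build_chain([
    {"item": "batch-001", "location": "Supplier"},
    {"item": "batch-001", "location": "Manufacturer"},
    {"item": "batch-001", "location": "Retailer"},
])
valid_before = is_valid(chain)

# Tampering with an earlier record is detected when the chain is re-checked
chain[1]["record"]["location"] = "Unknown warehouse"
valid_after = is_valid(chain)
```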
5. What role do IoT devices play in enhancing the blockchain-based supply chain system?
a. Ensuring data privacy
b. Providing visibility
c. Facilitating smart contract execution
d. Enhancing traceability
IoT devices can capture real-time data on the movement and condition of goods, enhancing traceability within
the supply chain and ensuring that accurate information is recorded on the blockchain.
6. What role do IoT devices play in ensuring product quality within the supply chain?
a. Reduces delay in production
b. Automates order placement
c. Facilitates smart contract execution
d. Provides real-time data on the condition of goods
IoT devices can monitor various parameters such as temperature, humidity, and location in real-time, providing
insights into the quality and condition of products throughout the supply chain.
QUESTION 3
Read the situation and select the most appropriate answer for each of the given multiple choice questions:
NexTech Solutions is engaged in a high-stakes project for ZetaCorp, a global fintech leader facing sophisticated
cyber threats. Led by experienced Project Manager Ali, the team is tackling advanced threats targeting
ZetaCorp's proprietary algorithms and financial transaction systems. The project involves a comprehensive risk
management strategy, focusing on emerging threats in digital identity theft and advanced persistent threats
(APTs).
Ali's team is conducting an array of penetration tests simulating state-sponsored cyber-attacks, aiming to
identify zero-day vulnerabilities and complex exploit chains. The discovered vulnerabilities in ZetaCorp's systems
are subtle and deeply embedded, requiring nuanced understanding and innovative countermeasures.
Recognizing the potential for catastrophic data breaches, Ali emphasizes the need for an adaptive and layered
approach to disaster recovery and business continuity, capable of responding to multi-vector attacks and
ensuring operational resilience under extreme scenarios.
1. In the context of advanced persistent threats (APTs), what is the significance of 'dwell time'?
a. The time taken to completely remove the threat from the system
b. The period during which a threat remains undetected within a network
c. The duration of a standard penetration test
d. The time required for a system reboot after an attack
Dwell time is the period during which a threat remains undetected within a network (option b). For APTs, it
measures how long an intruder can operate inside the environment before being discovered.
2. Ali's layered approach to disaster recovery (if one layer of defense is compromised, others can still
provide protection) for ZetaCorp primarily addresses what aspect of cybersecurity?
a. Single-point solution for all cyber threats
b. Redundancy and resilience in the face of diverse attack vectors
c. Compliance with basic security standards
d. Focus on external threats only
Ali's approach focuses on ensuring redundancy and resilience in ZetaCorp's defense mechanisms, allowing the
organization to withstand attacks from various angles.
Authentication weakness typically refers to vulnerabilities associated with the authentication process, such as
weak passwords, phishing attacks, or social engineering exploits, which primarily target human users.
A comprehensive risk management strategy should address both internal and external threat vectors to
effectively mitigate potential risks and vulnerabilities.
5. The adaptability and comprehensiveness of response plans are challenged by multi-vector attacks due
to:
a. The simplicity of these attacks
b. Their predictable nature
c. The need to address simultaneous threats across different domains
d. The focus on only digital assets
Multi-vector attacks involve attackers targeting an organization's systems and networks through various
methods simultaneously, making it challenging to respond comprehensively due to the need to address threats
across different domains concurrently.
Longer dwell times indicate that threat actors have more time to explore, gather sensitive information, and
execute potentially damaging actions within a network, increasing the potential impact of the attack.