
Data Analysis Project¶


Project Title: Numerical Integration of Mass-flux Weighted Average Mach Number of a Hypersonic Flow Using Python


Project Description¶

The Supersonic Combustion Ramjet (Scramjet) has drawn the attention of scientists and researchers due to its absence of moving parts and its high operating speed (Mach 5 or above). Both numerical and experimental approaches have been taken over the years to test and develop such vehicles. This project deals with such numerical simulation data (the simulation of the Scramjet combustor's flow field was performed in ANSYS Fluent 22.0).

Project Goal:¶

The aim of this project is to determine the speed zone of a Scramjet using the simulation data of its ignition chamber.

Project Outlines:¶

  • To read in the simulation data.
  • To manipulate the input data.
  • To perform numerical integration using the integral equation of Drozda et al. (Tomasz G. Drozda, Jacob J. Lampenfield, Rohan Deshmukh, Robert A. Baurle, and J. Philip Drummond). Source.
  • To show the output of the numerical integration.
  • To visualize the result, showing the Mach number along the streamwise location, and investigate the speed zone of the air vehicle.

Problem Solution Approach¶

We are going to use the six phases of the Data Analytics process (Ask, Prepare, Process, Analyze, Share, and Act) to achieve the project goal.

1. Ask ( Problem Definition):¶

The problem in this project is to evaluate the speed zone of the air vehicle, so the question at hand is how to characterize the vehicle's speed. To answer it, we need speed data. Since the hypersonic flow is compressible, temperature, density, and pressure all fluctuate throughout the field, so a raw velocity value alone is not very informative. Hence, we will use the Mach number instead of velocity: the Mach number normalizes the flow speed by the local speed of sound, which depends on the local temperature (and, through the equation of state, temperature and pressure go hand in hand).
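For reference, the Mach number relates flow speed to the local speed of sound, which for an ideal gas depends only on temperature. A minimal sketch (the gas constants and sample values here are illustrative, not taken from the simulation):

```python
import math

def mach_number(u, T, gamma=1.4, R=287.05):
    """Local Mach number from flow speed u (m/s) and static temperature T (K).

    Assumes a calorically perfect ideal gas, so the speed of sound
    depends only on temperature: a = sqrt(gamma * R * T).
    """
    a = math.sqrt(gamma * R * T)  # local speed of sound (m/s)
    return u / a

# Illustrative values: a 2700 m/s flow at 1200 K is roughly Mach 3.9
print(round(mach_number(2700.0, 1200.0), 2))
```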

2. Prepare ( Collection of Data ):¶

In this phase of data analysis, analysts collect relevant data. Luckily for us, we already have relevant data in the simulation dataset.

3. Process ( Cleaning, Removing Inaccuracies or Duplicates, Formatting, or Manipulating Data ):¶

In the third phase of data analysis, we will prepare the simulation data for the numerical integration model. We will clean the data by removing null or duplicate values. Then we will sort and format the dataset so that it is ready to be fed into the model. Let's take a step-by-step approach. First, let us import the Python libraries and create a DataFrame from the simulation dataset.

In [1]:
# Importing all python libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
In [2]:
# Reading csv file and creating DataFrame

initial_data = pd.read_csv("E:/All/Codes/PR120P.csv")
initial_data
Out[2]:
x_coordinate y_coordinate density x_velocity f_h2 Total_Pressure Mach f_air
0 0.121259 0.002284 0.297909 2688.629639 1.145280e-04 3.575866e+07 4.287759 0.999885
1 0.120993 0.002284 0.299933 2687.856445 1.149930e-04 3.580952e+07 4.282809 0.999885
2 0.121259 0.002160 0.297353 2687.101563 1.099640e-04 3.533384e+07 4.279319 0.999890
3 0.120993 0.002160 0.299370 2686.313965 1.103610e-04 3.538020e+07 4.274288 0.999890
4 0.121259 0.002405 0.298466 2690.105225 1.193370e-04 3.618473e+07 4.296156 0.999881
... ... ... ... ... ... ... ... ...
120476 0.200000 0.020000 0.119235 0.000000 6.617128e-03 2.405563e+05 0.398403 0.993383
120477 0.200000 0.000000 0.571604 2525.432617 3.680000e-05 2.679004e+07 3.492247 0.999963
120478 0.000000 0.020000 0.350834 0.000000 6.417983e-02 3.194910e+05 0.011509 0.935820
120479 0.040000 0.028000 0.081500 0.000000 4.023827e-01 2.479549e+05 0.000241 0.597617
120480 0.000000 0.000000 0.000699 0.000000 5.900000e-22 6.291669e+02 0.008223 1.000000

120481 rows × 8 columns

Cleaning:¶

Now, let's remove any null data from the DataFrame.

In [3]:
# Checking and removing null values

null_chceked_data = initial_data.dropna()
null_chceked_data
Out[3]:
x_coordinate y_coordinate density x_velocity f_h2 Total_Pressure Mach f_air
0 0.121259 0.002284 0.297909 2688.629639 1.145280e-04 3.575866e+07 4.287759 0.999885
1 0.120993 0.002284 0.299933 2687.856445 1.149930e-04 3.580952e+07 4.282809 0.999885
2 0.121259 0.002160 0.297353 2687.101563 1.099640e-04 3.533384e+07 4.279319 0.999890
3 0.120993 0.002160 0.299370 2686.313965 1.103610e-04 3.538020e+07 4.274288 0.999890
4 0.121259 0.002405 0.298466 2690.105225 1.193370e-04 3.618473e+07 4.296156 0.999881
... ... ... ... ... ... ... ... ...
120476 0.200000 0.020000 0.119235 0.000000 6.617128e-03 2.405563e+05 0.398403 0.993383
120477 0.200000 0.000000 0.571604 2525.432617 3.680000e-05 2.679004e+07 3.492247 0.999963
120478 0.000000 0.020000 0.350834 0.000000 6.417983e-02 3.194910e+05 0.011509 0.935820
120479 0.040000 0.028000 0.081500 0.000000 4.023827e-01 2.479549e+05 0.000241 0.597617
120480 0.000000 0.000000 0.000699 0.000000 5.900000e-22 6.291669e+02 0.008223 1.000000

120481 rows × 8 columns

Since the number of rows in the DataFrame is exactly the same before and after dropping null values (120481 rows), there were no null values in the dataset. Let us check for duplicates now.
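An alternative to comparing row counts is to inspect the per-column null counts directly. A minimal sketch, using a tiny hypothetical stand-in for the notebook's `initial_data`:

```python
import pandas as pd

# Hypothetical stand-in for the simulation DataFrame used in this notebook
initial_data = pd.DataFrame({
    "x_coordinate": [0.0, 0.1, 0.2],
    "Mach": [0.01, 2.5, 4.3],
})

# Per-column count of missing values; all zeros means dropna() removes nothing
null_counts = initial_data.isna().sum()
print(null_counts)
print("Any nulls:", initial_data.isna().any().any())
```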

In [4]:
# Checking for duplicate values

check_for_duplicate_data = null_chceked_data.copy()
check_for_duplicate_data['Check_duplicate'] = check_for_duplicate_data.duplicated()
check_for_duplicate_data
Out[4]:
x_coordinate y_coordinate density x_velocity f_h2 Total_Pressure Mach f_air Check_duplicate
0 0.121259 0.002284 0.297909 2688.629639 1.145280e-04 3.575866e+07 4.287759 0.999885 False
1 0.120993 0.002284 0.299933 2687.856445 1.149930e-04 3.580952e+07 4.282809 0.999885 False
2 0.121259 0.002160 0.297353 2687.101563 1.099640e-04 3.533384e+07 4.279319 0.999890 False
3 0.120993 0.002160 0.299370 2686.313965 1.103610e-04 3.538020e+07 4.274288 0.999890 False
4 0.121259 0.002405 0.298466 2690.105225 1.193370e-04 3.618473e+07 4.296156 0.999881 False
... ... ... ... ... ... ... ... ... ...
120476 0.200000 0.020000 0.119235 0.000000 6.617128e-03 2.405563e+05 0.398403 0.993383 False
120477 0.200000 0.000000 0.571604 2525.432617 3.680000e-05 2.679004e+07 3.492247 0.999963 False
120478 0.000000 0.020000 0.350834 0.000000 6.417983e-02 3.194910e+05 0.011509 0.935820 False
120479 0.040000 0.028000 0.081500 0.000000 4.023827e-01 2.479549e+05 0.000241 0.597617 False
120480 0.000000 0.000000 0.000699 0.000000 5.900000e-22 6.291669e+02 0.008223 1.000000 False

120481 rows × 9 columns

In [5]:
# Showing the duplicate values

check_for_duplicate_data[check_for_duplicate_data['Check_duplicate']==True]
Out[5]:
x_coordinate y_coordinate density x_velocity f_h2 Total_Pressure Mach f_air Check_duplicate

So, it appears from the output that there are no duplicate values in the current simulation dataset. However, we will still write the code that removes duplicates, so that if another dataset does contain duplicates, they will be removed as well.

In [6]:
# Removing the duplicate values

duplicate_removed_data = null_chceked_data.drop_duplicates()
duplicate_removed_data
Out[6]:
x_coordinate y_coordinate density x_velocity f_h2 Total_Pressure Mach f_air
0 0.121259 0.002284 0.297909 2688.629639 1.145280e-04 3.575866e+07 4.287759 0.999885
1 0.120993 0.002284 0.299933 2687.856445 1.149930e-04 3.580952e+07 4.282809 0.999885
2 0.121259 0.002160 0.297353 2687.101563 1.099640e-04 3.533384e+07 4.279319 0.999890
3 0.120993 0.002160 0.299370 2686.313965 1.103610e-04 3.538020e+07 4.274288 0.999890
4 0.121259 0.002405 0.298466 2690.105225 1.193370e-04 3.618473e+07 4.296156 0.999881
... ... ... ... ... ... ... ... ...
120476 0.200000 0.020000 0.119235 0.000000 6.617128e-03 2.405563e+05 0.398403 0.993383
120477 0.200000 0.000000 0.571604 2525.432617 3.680000e-05 2.679004e+07 3.492247 0.999963
120478 0.000000 0.020000 0.350834 0.000000 6.417983e-02 3.194910e+05 0.011509 0.935820
120479 0.040000 0.028000 0.081500 0.000000 4.023827e-01 2.479549e+05 0.000241 0.597617
120480 0.000000 0.000000 0.000699 0.000000 5.900000e-22 6.291669e+02 0.008223 1.000000

120481 rows × 8 columns

Sorting and Formatting:¶

Now that we have removed both duplicates and null values, let us sort the data first by the column x_coordinate and then by y_coordinate. After that, let us reset the indexes, which will make our dataset ready for the Analyze phase.

In [7]:
# Sorting data in DataFrame

data = duplicate_removed_data.sort_values(by=["x_coordinate","y_coordinate"])
data
Out[7]:
x_coordinate y_coordinate density x_velocity f_h2 Total_Pressure Mach f_air
120480 0.0 0.000000 0.000699 0.000000 5.900000e-22 629.166870 0.008223 1.000000
35663 0.0 0.000184 0.000695 0.000000 5.900000e-22 625.543518 0.010712 1.000000
34974 0.0 0.000363 0.000686 0.000000 5.900000e-22 616.928162 0.016225 1.000000
34283 0.0 0.000537 0.000673 0.000000 5.900000e-22 605.727539 0.021991 1.000000
33590 0.0 0.000707 0.000659 0.000000 5.910000e-22 592.503418 0.026975 1.000000
... ... ... ... ... ... ... ... ...
16553 0.2 0.019784 0.149142 1048.098755 0.000000e+00 313607.250000 0.755955 1.000000
18055 0.2 0.019842 0.144697 905.724548 0.000000e+00 284324.125000 0.644490 1.000000
19585 0.2 0.019897 0.138128 802.791992 0.000000e+00 266158.218800 0.559121 1.000000
21090 0.2 0.019950 0.126462 686.678040 0.000000e+00 249958.140600 0.460395 1.000000
120476 0.2 0.020000 0.119235 0.000000 6.617128e-03 240556.265600 0.398403 0.993383

120481 rows × 8 columns

In [8]:
# Resetting the indexes

data = data.reset_index(drop=True)
data
Out[8]:
x_coordinate y_coordinate density x_velocity f_h2 Total_Pressure Mach f_air
0 0.0 0.000000 0.000699 0.000000 5.900000e-22 629.166870 0.008223 1.000000
1 0.0 0.000184 0.000695 0.000000 5.900000e-22 625.543518 0.010712 1.000000
2 0.0 0.000363 0.000686 0.000000 5.900000e-22 616.928162 0.016225 1.000000
3 0.0 0.000537 0.000673 0.000000 5.900000e-22 605.727539 0.021991 1.000000
4 0.0 0.000707 0.000659 0.000000 5.910000e-22 592.503418 0.026975 1.000000
... ... ... ... ... ... ... ... ...
120476 0.2 0.019784 0.149142 1048.098755 0.000000e+00 313607.250000 0.755955 1.000000
120477 0.2 0.019842 0.144697 905.724548 0.000000e+00 284324.125000 0.644490 1.000000
120478 0.2 0.019897 0.138128 802.791992 0.000000e+00 266158.218800 0.559121 1.000000
120479 0.2 0.019950 0.126462 686.678040 0.000000e+00 249958.140600 0.460395 1.000000
120480 0.2 0.020000 0.119235 0.000000 6.617128e-03 240556.265600 0.398403 0.993383

120481 rows × 8 columns

4. Analyze ( Building Model, Testing and Analyzing, or Using Previous Model and Analyzing ):¶

In the fourth phase of data analysis, we analyze the collected data, using tools to transform and organize the information so that we can draw conclusions. This involves building models with an analytic approach and using those models to test and evaluate the data and obtain the results.

Modelling:¶

We will use the mass-flux weighted averaging introduced by Drozda et al. for characterizing hypersonic flow fields. The mass-flux weighted average Mach number can be expressed as,

$$ M_{avg} = \frac{\int M \rho u \, dA}{\int \rho u \, dA} $$

where,

$ M_{avg}$ = mass-flux weighted average Mach number in the flow field at a given x-location

$ M $ = local Mach number ( at a given node ) [$\frac {\text{local flow velocity}}{\text{local sound velocity}}$]

$\rho $ = local density of the flow

$ u $ = local x-velocity of the flow field

$ dA $ = local cross-sectional area perpendicular to the flow field

[Figure: schematic of the nodal grid across the ignitor, with the nodes at streamwise position $x_{m}$ marked as blue dots]

During the simulation, the maximum width and height of the ignitor were divided into (r-1) and (n-1) segments respectively. Vertical and horizontal lines are drawn from the corner points of each segment, and the intersections of these lines are the nodal points. Data were collected at each nodal point, and we use those data here. We denote the points along the width ( x-axis ) as $x_{1}$, $x_{2}$, $x_{3}$ $........$ $x_{r}$ and the points along the height ( y-axis ) as $y_{1}$, $y_{2}$, $y_{3}$ $........$ $y_{n}$.

For each particular x-position, the numerical integration is performed by dividing a numerator sum by a denominator sum. The numerator is the sum, over all nodes at that x-position, of the product of the local Mach number, local density, local x-velocity, and local cross-sectional area perpendicular to the flow; the denominator is the corresponding sum of the product of local density, local x-velocity, and local cross-sectional area.

For instance, at horizontal position $x_{m}$ there are $n$ nodes ( one for each y-position ), shown by the blue dots. Node 1 is located at ($x_{m}$ , $y_{1}$), Node 2 at ($x_{m}$ , $y_{2}$), and similarly Node $n$ at ($x_{m}$ , $y_{n}$). The local Mach number, density, velocity, and cross-sectional area at Node 1 can be written as $M_{m}^{1}$ , $\rho_{m}^{1}$ , $u_{m}^{1}$ and $dA_{m}^{1}$ respectively. Similarly, the corresponding terms at Node 2 , $......$ , and Node $n$ are ( $M_{m}^{2}$ , $\rho_{m}^{2}$ , $u_{m}^{2}$, $dA_{m}^{2}$ ) , $......$ , and ( $M_{m}^{n}$ , $\rho_{m}^{n}$ , $u_{m}^{n}$, $dA_{m}^{n}$ ) respectively. Therefore, for x-position $x_{m}$ , the mass-flux weighted average Mach number can be calculated as,

$$ \Bigg[M_{avg}\Bigg]_{x_{m}} = \frac {(M_{m}^{1} \rho_{m}^{1} u_{m}^{1} dA_{m}^{1}+M_{m}^{2} \rho_{m}^{2} u_{m}^{2} dA_{m}^{2}+.....+M_{m}^{n} \rho_{m}^{n} u_{m}^{n} dA_{m}^{n})} {( \rho_{m}^{1} u_{m}^{1} dA_{m}^{1} + \rho_{m}^{2} u_{m}^{2} dA_{m}^{2}+.....+ \rho_{m}^{n} u_{m}^{n} dA_{m}^{n} )} = \frac {\sum \limits_{i=1}^{n} M_{m}^{i} \rho_{m}^{i} u_{m}^{i} dA_{m}^{i}} {\sum \limits_{i=1}^{n}\rho_{m}^{i} u_{m}^{i} dA_{m}^{i}} $$
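In code, this discrete form at a single x-station reduces to two weighted sums. A minimal NumPy sketch with made-up nodal values (not data from the simulation):

```python
import numpy as np

# Hypothetical nodal values at one streamwise station x_m
M   = np.array([4.2, 4.3, 4.1, 3.9])              # local Mach numbers
rho = np.array([0.30, 0.29, 0.31, 0.28])          # local densities (kg/m^3)
u   = np.array([2690.0, 2685.0, 2688.0, 2680.0])  # local x-velocities (m/s)
dA  = np.array([0.001, 0.002, 0.002, 0.001])      # local area elements (m^2)

# Mass flux through each nodal area element
mass_flux = rho * u * dA

# Mass-flux weighted average Mach number at this station
M_avg = np.sum(M * mass_flux) / np.sum(mass_flux)
print(M_avg)
```

Being a weighted average, the result always lies between the smallest and largest local Mach numbers at that station.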

Now we will write the code for a function that calculates the mass-flux weighted average Mach number at each x-location using the above equation. But first, we will find the unique x-coordinates in the DataFrame.

Unique X-Coordinates:¶

In [9]:
# Calculating number of unique x_data along the streamwise direction(x_direction)

row_num = data.shape[0]   # Returns the number of rows
print("Row Number: {}".format(row_num))

x_dataset = pd.DataFrame(columns=['unique_x_coordinate'])
i=0

# The data are sorted by x_coordinate, so a new value ends wherever
# consecutive rows differ; the last row always closes the final group.
for n in range(row_num):
    if n < (row_num - 1):
        if data['x_coordinate'].values[n] != data['x_coordinate'].values[n+1]:
            x_dataset.loc[i, "unique_x_coordinate"] = data['x_coordinate'].values[n]
            i = i + 1
    else:
        x_dataset.loc[i, "unique_x_coordinate"] = data['x_coordinate'].values[row_num-1]

x_dataset
Row Number: 120481
Out[9]:
unique_x_coordinate
0 0.0
1 0.000107
2 0.000218
3 0.000332
4 0.000449
... ...
546 0.195988
547 0.196977
548 0.197975
549 0.198983
550 0.2

551 rows × 1 columns
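Since `data` is already sorted by `x_coordinate`, the same list of stations can also be obtained in one line with `Series.unique()`, which preserves the order of first appearance. A sketch on a tiny stand-in DataFrame:

```python
import pandas as pd

# Hypothetical sorted DataFrame standing in for the notebook's `data`
data = pd.DataFrame({"x_coordinate": [0.0, 0.0, 0.000107, 0.000107, 0.2]})

# unique() keeps the order of first appearance, so the sorted order survives
x_dataset = pd.DataFrame({"unique_x_coordinate": data["x_coordinate"].unique()})
print(len(x_dataset))  # number of distinct streamwise stations
```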

We will now write a class to determine the mass-flux weighted average Mach number.

In [10]:
# Class for performing numerical integration at individual streamwise(x) location.

class Mach_Scramjet():
    def __init__(self,data,x_dataset):
        self.x_dataset = x_dataset
        self.data = data
        self.row_unique = x_dataset.shape[0]
        self.Result_dataset = pd.DataFrame(columns=['x_coordinate','Mass_Flux_Average_Mach'])
        
    # cal_dataset refers to calculation dataset    
    def mass_flux_Mach(self):
        print("Performing Calculations... Please Wait ...")
        for ii in range(self.row_unique):
            self.cal_dataset = self.data[self.x_dataset.loc[ii,"unique_x_coordinate"]==self.data.loc[:,"x_coordinate"]]
            self.cal_dataset = self.cal_dataset.reset_index(drop=True)
            self.cal_dataset_row = self.cal_dataset.shape[0]
                    
            #Finding Area
            for kk in range(self.cal_dataset_row):
                if kk>0 :
                    self.cal_dataset.loc[kk,'y_difference'] = (self.cal_dataset.loc[kk,'y_coordinate']
                                                               -self.cal_dataset.loc[kk-1,'y_coordinate'])
                elif kk==0:
                    self.cal_dataset.loc[kk,'y_difference'] = self.cal_dataset.loc[kk,'y_coordinate']
            
            for m in range(self.cal_dataset_row):
                if m==0 :
                    self.cal_dataset.loc[m,'dA'] = self.cal_dataset.loc[m,'y_difference']/2
                elif m < (self.cal_dataset_row - 1):
                    self.cal_dataset.loc[m,"dA"] = (self.cal_dataset.loc[m,'y_difference']
                                                    +self.cal_dataset.loc[m+1,'y_difference'])/2
                elif m == (self.cal_dataset_row-1) :
                    self.cal_dataset.loc[m,'dA'] = self.cal_dataset.loc[m,'y_difference']/2
            
            # Integral Individual Result Data (Mach terms corresponding to each x_coordinate)
            for p in range(self.cal_dataset_row):
                self.cal_dataset.loc[p,'Mach_numerator'] = (self.cal_dataset.loc[p,'density']*
                    self.cal_dataset.loc[p,'x_velocity']*self.cal_dataset.loc[p,'Mach']*self.cal_dataset.loc[p,'dA'])
                self.cal_dataset.loc[p,'Mach_denominator'] = (self.cal_dataset.loc[p,'density']*
                                                   self.cal_dataset.loc[p,'x_velocity']*self.cal_dataset.loc[p,'dA'])
    
            self.total_numerator = sum(self.cal_dataset.loc[:,'Mach_numerator'])
            self.total_denominator = sum(self.cal_dataset.loc[:,'Mach_denominator'])
            self.total_result = self.total_numerator/self.total_denominator

            self.Result_dataset.loc[ii,'x_coordinate'] = self.x_dataset.loc[ii,'unique_x_coordinate']
            self.Result_dataset.loc[ii,'Mass_Flux_Average_Mach'] = self.total_result
               
        print("\033[92mCalculation Completed!")

        return self.Result_dataset
    
In [11]:
# Calculating the mass-flux weighted average Mach number

Result_calculator = Mach_Scramjet(data,x_dataset)
Result_dataset = Result_calculator.mass_flux_Mach()
Result_dataset
Performing Calculations... Please Wait ...
Calculation Completed!
Out[11]:
x_coordinate Mass_Flux_Average_Mach
0 0.0 4.991847
1 0.000107 4.984752
2 0.000218 4.982467
3 0.000332 4.979782
4 0.000449 4.977022
... ... ...
546 0.195988 2.946699
547 0.196977 2.950356
548 0.197975 2.953961
549 0.198983 2.957498
550 0.2 2.959202

551 rows × 2 columns
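The per-station loop in the class above can also be expressed without explicit Python loops by grouping on `x_coordinate`. The sketch below runs on made-up toy data (column names follow the notebook) and uses standard trapezoidal area weights, whose treatment of the first node differs slightly from the class's `y_difference` scheme:

```python
import numpy as np
import pandas as pd

# Toy stand-in for the sorted simulation DataFrame (values are made up)
data = pd.DataFrame({
    "x_coordinate": [0.0, 0.0, 0.0, 0.1, 0.1, 0.1],
    "y_coordinate": [0.00, 0.01, 0.02, 0.00, 0.01, 0.02],
    "density":      [0.30, 0.29, 0.28, 0.30, 0.29, 0.28],
    "x_velocity":   [2690., 2685., 2680., 2000., 1990., 1980.],
    "Mach":         [4.2, 4.1, 4.0, 3.0, 2.9, 2.8],
})

def station_average(group):
    """Mass-flux weighted average Mach at one x-station."""
    y = group["y_coordinate"].to_numpy()
    # Trapezoidal weights: half-spacing at the walls, full spacing inside
    dy = np.diff(y)
    dA = np.empty_like(y)
    dA[0] = dy[0] / 2
    dA[-1] = dy[-1] / 2
    if len(y) > 2:
        dA[1:-1] = (dy[:-1] + dy[1:]) / 2
    flux = group["density"].to_numpy() * group["x_velocity"].to_numpy() * dA
    return np.sum(group["Mach"].to_numpy() * flux) / np.sum(flux)

result = {x: station_average(g) for x, g in data.groupby("x_coordinate")}
print(result)
```

Iterating over `groupby` groups keeps the per-station logic explicit while letting pandas handle the partitioning by x-coordinate.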

5. Share ( Result Summarization, Visualization, Recommendations ):¶

Let us plot the result dataset along the streamwise location to visualize the Mach number. From the graph, we will be able to evaluate the speed zone of the air vehicle.

In [12]:
# Plotting the final graph

# Extracting the result_dataset
x_data = Result_dataset.loc[:,'x_coordinate']
y_data = Result_dataset.loc[:,"Mass_Flux_Average_Mach"]

# Specifying the fonts and fontproperties for title,labels and ticks
font = {'family': ['Times New Roman','serif'],
        'color':  'black',
        'weight': 'bold',
        'size': 14,
        }
font_tick = {
    "family":"Times New Roman",
    "size":12,
    "weight":"heavy",
    "style":"normal"
}

# Plotting the result_data
plt.figure(figsize=(16,6))
plt.plot(x_data,y_data,"r")

# Manipulating the title, labels and legend
plt.title("Mass flux weighted average Mach number along streamwise location",fontdict=font,size=18,fontweight='bold',pad=10)
plt.xlabel('Streamwise location (x coordinate)',fontdict=font,labelpad=15)
plt.ylabel('Mach',fontdict=font,labelpad=15)
plt.legend(["Pressure ratio 12"],loc="upper right",shadow=True,edgecolor='black',borderpad=0.6,prop={'weight':"normal"})

# Rearranging and specifying the ticks
plt.xticks(np.arange(0,0.200003,0.02),fontproperties=font_tick)
plt.yticks(np.arange(0,6.01,1),fontproperties=font_tick)

# Specifying the limits and generating grids
plt.xlim(0,0.2)
plt.ylim(0,6)
plt.grid(True)

# Showing the plot
plt.show()

Key Takeaways:¶

  • Inside the combustor, the mass-flux weighted average Mach number drops to about 2.7.
  • At the exit, the Mach value reaches close to 3.0.
  • The whole ignitor operates between approximately Mach 5.0 and Mach 2.7.
  • The whole ignitor operates in the hypersonic zone.

6. Act ( Implementation, Decision Making and Feedback on Model ):¶

In the final phase of the Data Analytics project, it is time for decision making. However, this project does not emphasize decision making so much as key findings. Based on the current data analysis, the research team leader (stakeholder) concludes that the vehicle operates in the hypersonic zone.

Resources:¶

To explore the notebook, visit: github

Export the datasets from the following cell.

In [13]:
# Exporting the output files

from IPython.display import display
import ipywidgets as widgets

def download_result_Mach(event):
    data.to_csv("C:/Users/mahbu/Downloads/Input_dataset_Mach.csv")
    print("\033[92m Input Dataset Downloaded Successfully!")
    Result_dataset.to_csv("C:/Users/mahbu/Downloads/Result_PR120M.csv")
    print("\033[92m Mach Result Dataset Downloaded Successfully!")
    
csv_download_button_M = widgets.Button(description='Export Data',disabled=False,
                                     style=dict(button_color='#d7dadd',font_weight='bold'))
csv_download_button_M.on_click(download_result_Mach)
display(csv_download_button_M)
Button(description='Export Data', style=ButtonStyle(button_color='#d7dadd', font_weight='bold'))
 Input Dataset Downloaded Successfully!
 Mach Result Dataset Downloaded Successfully!

Authors:¶

Md. Mahbub Talukder,
BSc. in Mechanical Engineering,
Bangladesh University of Engineering and Technology.