
Unable to reset array in asynchronous Node.js function despite using Callbacks


I am trying to create a basic web scraper API. A common function, PullArticle(), uses a nested loop to scrape articles matching keywords (different sets of articles and different sets of keywords), depending on the GET request I send.

I need to reset the array “article” that I pass to the callback between GET requests to keep the data separate, but each GET request just appends to the results of previous calls, which duplicates data if I call the same request twice.

On advice from Stack Overflow I have tried a callback function, and before that a promise, because I was under the impression that Topic.forEach was running asynchronously and causing “article” to be returned empty. However, I haven’t been able to get it to work no matter what, and I was hoping somebody could point out what I’m doing wrong here.

var article = []

function PullArticle (Topic, myCallback) {
    article = [] // IF I LEAVE THIS RESET OUT ARRAY RETURNS EMPTY :(

    Topic.forEach(TopicLoop => {
        newspapers.forEach(newspapers => {
            axios.get(newspapers.address) // pulling html
                .then((response) => {
                    const html = response.data
                    const $ = cheerio.load(html) // allows picking out elements
                    $(`a:contains(${TopicLoop})`, html).each(function () {
                        const title = $(this).text()
                        const url = $(this).attr('href')
                        article.push({
                            title,
                            url: newspapers.base + url,
                            source: newspapers.name,
                        })
                    })
                })
        })
    })
    let sendback = article

    myCallback(sendback)
}

In the same file I handle the GET requests with:

app.get('/TopicMatrix1', (req, res) => {
    PullArticle(Topic1, myDisplayer)
    function myDisplayer (PrintArticle) {
        res.json(PrintArticle)
    }
})
app.get('/SomeOtherTopic', (req, res) => {
    PullArticle()
    etc
})
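
For context on the behaviour described: in PullArticle() the callback is invoked right after the loops have started the requests, before any axios.get() promise has resolved, so the array is still empty at that moment; without the reset, the module-level array keeps whatever earlier requests pushed into it. Below is a minimal sketch of one possible restructuring, not a definitive fix. It builds a fresh result per call and waits for every request with Promise.all. The names pullArticle and fetchLinks, and the sample newspapers entries, are assumptions made up for illustration; fetchLinks stands in for the axios.get() plus cheerio extraction so the sketch runs without network access.

```javascript
// Hypothetical sample data; the real list lives elsewhere in the app.
const newspapers = [
  { name: 'paperA', address: 'https://a.example', base: 'https://a.example' },
  { name: 'paperB', address: 'https://b.example', base: 'https://b.example' },
];

// Hypothetical stand-in for axios.get(paper.address) followed by the
// cheerio extraction: resolves after a short delay with fake link data.
function fetchLinks(paper, topic) {
  return new Promise(resolve =>
    setTimeout(() => resolve([{ title: `${topic} story`, url: '/story' }]), 10)
  );
}

// Returns a promise for a new array on every call, so nothing is shared
// between GET requests and nothing needs to be reset.
async function pullArticle(topics) {
  const requests = [];
  for (const topic of topics) {
    for (const paper of newspapers) {
      // Collect one promise per (topic, newspaper) pair.
      requests.push(
        fetchLinks(paper, topic).then(links =>
          links.map(({ title, url }) => ({
            title,
            url: paper.base + url,
            source: paper.name,
          }))
        )
      );
    }
  }
  // Wait until every request has finished, then merge the per-request lists.
  const results = await Promise.all(requests);
  return results.flat();
}
```

Under these assumptions a route handler would respond only after the promise settles, for example with pullArticle(Topic1).then(articles => res.json(articles)).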

Also, does anyone know why I can’t make myDisplayer(), which sends the result with res.json, a common function sitting outside the GET handlers, so that separate GET requests can call it?
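
One likely reason is scope: res exists only inside each route handler, so a function defined outside the handler cannot see it. A minimal sketch of a shared helper, assuming a hypothetical factory named makeDisplayer that receives res and returns the callback; fakeRes is a made-up stand-in for an Express response, used only to exercise the sketch.

```javascript
// Returns a callback that is bound to one specific response object.
function makeDisplayer(res) {
  return function displayer(articles) {
    res.json(articles);
  };
}

// Hypothetical stand-in for an Express response object.
const fakeRes = {
  sent: null,
  json(body) {
    this.sent = body; // record what would have been sent to the client
  },
};

// Each route would create its own displayer from its own res.
makeDisplayer(fakeRes)([{ title: 'example' }]);
```

With a helper like this, each route could call PullArticle(Topic1, makeDisplayer(res)) instead of declaring its own myDisplayer.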



Source: https://stackoverflow.com/questions/70553508/unable-to-reset-array-in-asynchronous-node-js-function-despite-using-callbacks
