Improving Data Exploration

mcross1882

Member
Joined
Sep 12, 2022
Messages
57
Location
Texas
Aircraft
Aviomania G1sB
Howdy everyone, I wanted to reach out to the moderators and admins to see if it would be possible to work with y'all to improve the search functionality of all the data on the forum.

IRL I do data engineering and would love to donate my time to help improve this. Right now I am thinking about indexing the knowledge and informational threads so that we can easily run a text search over the content. However, without intense scraping, which likely violates site policy, my options are limited. I was curious whether I might be able to obtain a data dump of certain parts of the forum for loading into a text-optimized database.

I can also translate the data so that it could be plugged into ChatGPT for conversational search. Either way, if any of the mods or admins would like to discuss it further, please chime in on this thread or DM me!

Best regards,
Matt
 

DonBishop

Active Member
Joined
Feb 20, 2020
Messages
133
Location
Grizzly Flats
Aircraft
Mooney M20J
Total Flight Time
1200 hours
AI is fascinating. Bing's AI is already tired of humans, while ChatGPT is still civil. What framework would the mods have to install to implement everything you want to do?
 

MikeBoyette

Gold Member
Joined
Oct 30, 2003
Messages
3,266
Location
Plant City, Fl
Aircraft
Dominator
Total Flight Time
200+
Just remember, James Cameron has predicted what is coming. Skynet could be closer than we know. I heard Google had to shut some of their AI down because it became self-aware.
 

Vance

Gyroplane CFI
Staff member
Joined
Oct 30, 2003
Messages
18,134
Location
Santa Maria, California
Aircraft
Givens Predator
Total Flight Time
2600+ in rotorcraft
Thank you for the kind offer Matt.

What you describe is well outside of what I know about.

As a moderator I mostly just keep people inside the lines (no politics, no religion and no personal insults).

Kevin may know more.

I use search engines outside of the Rotary Wing Forum if I am looking for a particular something.
 

mcross1882

Member
Joined
Sep 12, 2022
Messages
57
Location
Texas
Aircraft
Aviomania G1sB
Howdy team! I appreciate all the quick feedback. Here are some more insights into what I was thinking.

On Don's question about frameworks: I think we would be able to forgo any mods or extensions to the forum. Most of the time these platforms provide a way to bulk export data in some messy format such as HTML or raw text files (or, worst case, we could export the database tables as flat files).

All I would need on my end is the textual data from the site, probably filtered down to categories that contain a lot of informational posts. General discussion threads can muck up search results pretty easily.
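As a rough sketch of that cleanup step, the snippet below strips markup from exported posts and keeps only selected categories. It assumes each exported post arrives as a dict with `url`, `category`, and `html` keys; that record layout and the category names are hypothetical, since the forum's actual export format isn't known.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and skips the contents of <script>/<style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse the runs of whitespace left behind where tags were removed.
    return " ".join(" ".join(parser.parts).split())


# Hypothetical category names; the real list would come from the forum.
INFORMATIONAL = {"Maintenance", "Rotor Blades"}


def clean_posts(raw_posts, keep=INFORMATIONAL):
    """Keep only informational categories and reduce each post to plain text."""
    return [
        {"url": p["url"], "category": p["category"], "text": html_to_text(p["html"])}
        for p in raw_posts
        if p["category"] in keep
    ]
```

Only the standard library is used here, so nothing would need to be installed on the forum side.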

To start, I would probably import the data into PostgreSQL, but in an optimized format where all the textual data is converted to normalized vectors. This lets me implement more advanced search functions that rank by relevance rather than by simple keyword matching. We could then provide an endpoint that lets people search and look up data within the database. Thinking out loud, we could also program it to return the URL of the page where it found the source info, so people can link directly back into the forum. I'd keep it as simple as possible at first and just make the content a lot more accessible. Good examples would be locating instructions on this forum for specific repairs, or part names/numbers for specific gyro models.
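To make the "normalized vectors" and "relevance" ideas concrete, here is a minimal in-memory sketch. It is a stand-in for the PostgreSQL setup described above, not that setup itself: it uses plain TF-IDF vectors scaled to unit length and ranks posts by cosine similarity. The post dicts with `url` and `text` keys are an assumed layout.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def normalize(vec):
    """Scale a sparse vector to unit length so dot products are cosine scores."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def build_index(posts):
    """Return (idf, vectors): term weights plus one unit TF-IDF vector per post."""
    docs = [Counter(tokenize(p["text"])) for p in posts]
    # Document frequency: in how many posts each term appears.
    df = Counter(term for doc in docs for term in doc)
    n = len(docs)
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [normalize({t: tf * idf[t] for t, tf in doc.items()}) for doc in docs]
    return idf, vectors


def search(query, posts, idf, vectors, limit=5):
    """Return up to `limit` (url, score) pairs, best match first."""
    counts = Counter(tokenize(query))
    q = normalize({t: tf * idf.get(t, 0.0) for t, tf in counts.items()})
    scored = []
    for post, vec in zip(posts, vectors):
        score = sum(w * vec.get(t, 0.0) for t, w in q.items())
        if score > 0:
            scored.append((score, post["url"]))
    scored.sort(key=lambda pair: -pair[0])
    return [(url, round(score, 3)) for score, url in scored[:limit]]
```

Each result carries the post's URL, which covers the "link directly back into the forum" part. A real deployment would more likely lean on the database's own full-text indexing than on a Python loop over every post.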

ChatGPT has a lot of potential but is a bit more pricey. I would need to do some math to calculate how much it would cost me to translate it all, based on the amount of text data on the forum. However, it is probably the most powerful option, as the example conversation below shows. I don't take its output as the absolute truth, but it is amazing as a starting point for locating documentation or basic knowledge sourcing.
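The cost math could look something like the back-of-the-envelope sketch below. Every number in it is a placeholder: the post count, the average post length, and the price are made up for illustration, and the figure of roughly 4 characters per token is only a common rule of thumb for English text, not an exact conversion.

```python
def estimate_cost(total_chars, price_per_1k_tokens, chars_per_token=4.0):
    """Rough dollar cost of pushing total_chars of text through a token-billed API."""
    tokens = total_chars / chars_per_token
    return tokens / 1000 * price_per_1k_tokens


# Placeholder inputs, NOT real forum statistics or real pricing.
posts = 200_000
avg_chars_per_post = 500
price = 0.01  # hypothetical dollars per 1,000 tokens

cost = estimate_cost(posts * avg_chars_per_post, price)
print(f"Estimated cost: ${cost:.2f}")  # Estimated cost: $250.00
```

Plugging in the real character count from a data dump and the provider's current price list would give the actual figure.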
 

Attachments

  • Screenshot from 2023-02-20 12-13-31.png

Kevin_Richey

Moderator
Staff member
Joined
Nov 16, 2003
Messages
3,025
Location
N. Central AZ @ 4,500'
I don't know a thing further, either.
I'm in the same box as Vance regarding the extent of my moderator job, too!
I neither possess nor claim extensive knowledge of computing.

The older I get, the more I deeply realize how little I really do know, compared to many others...!
I'm still in wonder at how auto-rotation is so much fun & such a simple way to fly, when flown w/in its proper flight envelope.

Send the forum owner, Todd Powell, an email referencing this post and the path you've outlined. He isn't actively monitoring the forum, so communicating through his email address should work. ([email protected])

He usually gets back to you w/in several days. You might need to send a 2nd email if there's no reply beyond then...
 

Sv.grainne

Super Member
Joined
Jan 14, 2020
Messages
2,034
Location
Kerrville, Texas
Aircraft
Aviomania, G1sB Genesis
Matt:

Good luck, there is no active site admin. Todd holds the license for the forum, but I do not think he does any admin. I've tried to get involved but got no response.
 